CANADIAN JOURNAL OF MATHEMATICS


Journal Canadien de Mathématiques 


VOL. VIII - NO. 2
1956
On connections of Cartan    Shôshichi Kobayashi    145
A generalization of an inequality of Hardy and Littlewood    K. T. Smith    157
Harmonic and analytic functions of several variables and the maximal theorem of Hardy and Littlewood    H. E. Rauch
An extension problem for functions with monotonic derivatives    W. B. Jurkat
On relationships amongst certain spaces of sequences in an arbitrary Banach space    C. W. McArthur
On a quasilinear equation    Richard Bellman
Modified boundary value problems for a quasilinear elliptic equation    G. F. D. Duff
Some algebraic properties of asymptotic power series    T. E. Hull
Asymptotic expansions    Leo Moser and Max Wyman
The asymptotic series for a certain class of permutation problems    N. S. Mendelsohn
Maximal determinants in combinatorial investigations    H. J. Ryser
A class of almost alternative algebras    L. A. Kokoris
Orthogonal isomorphic representations of free groups    J. de Groot
On a theorem of Baer and Higman    Sean Tobin
Commuting rings of endomorphisms    C. W. Curtis
The covering of space by spheres    E. S. Barnes




Published for 
THE CANADIAN MATHEMATICAL CONGRESS 
by the 


University of Toronto Press 





EDITORIAL BOARD 


H. S. M. Coxeter, A. Gauthier, R. D. James, R. L. Jeffery,
G. de B. Robinson, H. Zassenhaus 


with the co-operation of 


H. Behnke, R. Brauer, D. B. DeLury, G. F. D. Duff, I. Halperin, 
W. K. Hayman, J. Leray, S. MacLane, P. Scherk, B. Segre,
J. L. Synge, W. J. Webber 


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the
Editor-in-Chief, H. S. M. Coxeter, University of Toronto. Everything
possible should be done to lighten the task of the reader; the notation
and reference system should be carefully thought out. Every paper
should contain an introduction summarizing the results as far as possible
in such a way as to be understood by the non-expert.


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers 
is $8.00. This is reduced to $4.00 for individual members of 
recognized Mathematical Societies. 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 
University of Alberta 
Assumption College University of British Columbia 
Carleton College Ecole Polytechnique 
Université Laval Loyola College 
University of Manitoba McGill University 
McMaster University Université de Montréal 
Queen’s University Royal Military College 
St. Mary’s University University of Toronto 
National Research Council of Canada 


and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 

















ON CONNECTIONS OF CARTAN 
SHÔSHICHI KOBAYASHI


Introduction. Consider a differentiable manifold M and the tangent
bundle T(M) over M, the structure group of which is usually the general
linear group G’. Let P’ be the principal fibre bundle associated with T(M).
If we consider the fibre F of T(M) as an affine space, then the affine
transformation group G acts on F and contains G’ as the isotropic subgroup.
Following the idea of Klein, it is more natural to take G as the structure group
of the bundle T(M). Let P be the principal fibre bundle associated with T(M)
with group G.

In the classical theory of affine connections, there are two points of view. 
The one is due to Levi-Civita, who considered each tangent space of M as a 
vector space and explained a connection as a law of parallel displacement of 
vectors along curves. From the point of view of the theory of connections in 
fibre bundles, a connection in the sense of Levi-Civita is a connection in the 
principal fibre bundle P’ with group G’. The other point of view is due to E. 
Cartan. Following him, each tangent space of M is an affine space on which the 
affine transformation group G acts transitively, and an affine connection is a 
law of development of tangent spaces along curves; it is a connection in P. 

The idea of Cartan was rigorously established by Ehresmann (3) as follows.
Consider a fibre bundle B satisfying the conditions of soudure (see §2); the
fibre F is homeomorphic to a homogeneous space G/G’ and the structure group
G of B can be reduced to G’. As in the case of the tangent bundle, we obtain two
principal fibre bundles P and P’ with groups G and G’ respectively, and P’ is
contained in P. A connection in P is called a connection of Cartan if it satisfies
the following condition: the differential form $\omega$ defining the connection gives an
absolute parallelism on P’. The importance of this condition was shown in
previous papers (4; 5).

It is known that there is a correspondence between affine connections in the 
sense of Cartan and those in the sense of Levi-Civita; there is a canonical 
one-to-one correspondence between the set of connections in P and the set 
of connections in P’ (7). 

The purpose of the present paper is to show that there exists a one-to-one 
correspondence between the set of Cartan connections in P and the set of 
infinitesimal connections in P’, if the homogeneous space F = G/G’ is weakly 
reductive (see §2). We shall show also that in such a case the torsion forms can 
be defined. The last section will be devoted to the application to invariant 
connections. 


Received April 25, 1955. 













1. Tangent vectors. The manifolds and the mappings considered in
this paper are all of class $C^\infty$. For the definition of tangent vector and the
differential of a mapping, the reader is referred to Chevalley’s book (2).

Let M be a manifold. We denote by T(M) the set of all tangent vectors to M. 
For any two manifolds M and M’, we have a natural isomorphism 


\[ T(M \times M') = T(M) \times T(M'). \]


Let G be a Lie group and let $\varphi : G \times G \to G$ be the mapping defining the group
operation:

\[ \varphi(s, s') = s \cdot s', \qquad s, s' \in G. \]

Consider the differential mapping¹ $\delta\varphi : T(G \times G) \to T(G)$. Since $T(G \times G)$ is
identified with $T(G) \times T(G)$, $\delta\varphi$ can be considered as a mapping of $T(G) \times T(G)$
onto T(G), and it defines a group operation in T(G). The Lie group T(G),
obtained in this way, is called the tangent group to G. We have a natural
imbedding of G into T(G), and G is considered as a subgroup of T(G). The set
of all tangent vectors to G at the unit, which we shall denote by $T_e(G)$, is a
normal subgroup of T(G) and will be identified with the Lie algebra of G.
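
For a group of matrices the tangent group can be written down explicitly. The following is a minimal sketch under the assumption (not made in the text) that G is a Lie group of invertible matrices, so that a tangent vector at s is a matrix $\bar s$, and that $c(t)$, $c'(t)$ are curves in G through s, s' with velocities $\bar s$, $\bar s'$; these names are illustrative only.

```latex
% Assumption: G is a matrix Lie group; c(0)=s, c'(0)=s' with velocities \bar s, \bar s'.
\begin{align*}
\delta\varphi(\bar s,\bar s')
  &= \frac{d}{dt}\Big|_{t=0}\, c(t)\,c'(t)
   = \bar s\,s' + s\,\bar s'
   \quad\text{(a tangent vector at } s s'\text{)},\\
\delta\varphi(\bar a,\bar b) &= \bar a + \bar b,
  \qquad \bar a,\bar b \in T_e(G),\\
s\,\bar a\,s^{-1} &\in T_e(G),
  \qquad s \in G,\ \bar a \in T_e(G).
\end{align*}
```

Under this assumption the second line shows that the group operation restricted to $T_e(G)$ is vector addition, and the third line shows that conjugation by an element of G preserves $T_e(G)$, which is consistent with $T_e(G)$ being a normal subgroup of T(G).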

Suppose G acts, as a transformation group, on a manifold P on the right and
let $\psi : P \times G \to P$ be the mapping defining the transformation law. Then the
differential mapping

\[ \delta\psi : T(P) \times T(G) \to T(P) \]

defines T(G) as a transformation group on T(P) acting on the right. If P is a
principal fibre bundle over M with group G and with projection $\pi$, then T(P)
is a principal fibre bundle over T(M) with group T(G) and with projection
$\delta\pi$.


2. Soudure. Let B be a fibre bundle over base manifold M, with fibre F
and with Lie structure group G. B is soudé (3) to M, if the following conditions
are satisfied:

(s.1) G acts on F transitively: then F can be identified with the homogeneous
space G/G’, where G’ is the isotropic group at a point o of F.

(s.2) dim F = dim M.

(s.3) The structure group G of the bundle B can be reduced to G’: in other
words, B admits a cross-section, which we shall denote by $\sigma$. When B is
considered as the fibre bundle with structure group G’, it will be denoted by
B’.

(s.4) The two fibre bundles T(M) and $T_\sigma(B)$ over M, with group GL(n, R)
(where n = dim M), are equivalent, where T(M) is the space of all tangent
vectors to M and $T_\sigma(B)$ the space of all tangent vectors to $F_x$ at $\sigma(x)$, x
running through M.

Let P (resp. P’) be the principal fibre bundle associated to B (resp. B’). 


¹Chevalley denotes the differential of $\varphi$ by $d\varphi$.
























The structure group and the fibre of P (resp. P’) are G (resp. G’). P’ can be 
considered as a submanifold of P. 

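Let $\mathfrak g$, $\mathfrak g'$ be the Lie algebras of G and G’ respectively. Take a vector subspace
$\mathfrak f$ of $\mathfrak g$ such that

\[ \mathfrak g = \mathfrak g' + \mathfrak f, \qquad \mathfrak g' \cap \mathfrak f = \{0\}. \tag{2.1} \]

The tangent space $T_o(F)$ to F at o can be identified with $\mathfrak f$: let p be the natural
projection of G onto F = G/G’; then $\delta p$ maps $T_e(G)$ onto $T_o(F)$, and since
$T_e(G)$ and $\mathfrak g$ are identified, $\delta p$ maps $\mathfrak f$ onto $T_o(F)$ isomorphically.

Each element s of G’ induces a linear transformation of $T_o(F)$, which we shall
denote by $L_s$. If $\mathfrak f$ satisfies

\[ \mathrm{ad}(s)\cdot\mathfrak f \subset \mathfrak f, \qquad s \in G', \tag{2.2} \]

then $L_s$ corresponds to ad(s), when we identify $T_o(F)$ with $\mathfrak f$.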

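Now we shall construct a $T_o(F)$-valued linear differential form $\theta$ on $P'$
satisfying the following conditions:

($\theta$.1) If $\bar u \in T(P')$ and $\theta(\bar u) = 0$, then $\delta\pi(\bar u)$ is the zero vector, where $\pi$ is the
projection of $P'$ onto M.

($\theta$.2) $\theta(\bar u s) = L_s^{-1}\theta(\bar u)$, $\quad \bar u \in T(P'),\ s \in G'$.

($\theta$.3) $\theta(u \bar s) = 0$, $\quad u \in P',\ \bar s \in T(G')$.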


Let $\bar u$ be a tangent vector to $P'$ at u. The projection $\pi : P' \to M$ induces the
projection $\delta\pi : T(P') \to T(M)$, and $\delta\pi(\bar u)$ is a vector tangent to M at $\pi(u)$.
As the bundle B is soudé to M, the vector $\delta\pi(\bar u)$ can be identified with a vector
tangent to $F_x$ at $\sigma(x)$, where $x = \pi(u)$. We shall denote by $\bar u^*$ this vector
tangent to $F_x$ at $\sigma(x)$. The element $u \in P'$ is considered as a mapping of the
standard fibre F onto $F_x$ such that $u(o) = \sigma(x)$, where o is the point of F
which defined the isotropic group G’. The map u induces the differential map
$\delta u$ of T(F) onto $T(F_x)$. The inverse image $\delta u^{-1}(\bar u^*)$ of $\bar u^* \in T(F_x)$ by $\delta u$ is a
vector tangent to F at o, which we denote by $\theta(\bar u)$. Clearly $\theta$ is a linear
differential form on $P'$. If $\theta(\bar u) = 0$, then $\bar u^*$ is the zero vector; consequently $\delta\pi(\bar u)$ is
also the zero vector, which proves the property ($\theta$.1).

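Now we shall verify ($\theta$.2).

We see that $\bar u s$ is a tangent vector to $P'$ at us. As $\delta\pi(\bar u) = \delta\pi(\bar u s)$, we have
$\bar u^* = (\bar u s)^*$.

Then

\[ \theta(\bar u s) = \delta(us)^{-1}\cdot(\bar u s)^* = \delta(us)^{-1}\cdot\bar u^* = \delta s^{-1}\cdot\delta u^{-1}(\bar u^*) = \delta s^{-1}\,\theta(\bar u) = L_s^{-1}\theta(\bar u). \]

Finally we shall prove ($\theta$.3). For any $u \in P'$ and $\bar s \in T(G')$, $\delta\pi(u\bar s)$ is the zero
vector. From the definition of $\theta$, it is clear that $\theta(u\bar s) = 0$.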

Suppose our fibre bundle satisfies only the conditions (s.1)-(s.3). We shall
prove that, if there exists a $T_o(F)$-valued linear differential form $\theta$ on $P'$
which possesses the properties ($\theta$.1)-($\theta$.3), then the bundle B satisfies also the
condition (s.4).

Let $\bar x$ be a tangent vector to M at x and $\bar u$ be a tangent vector to $P'$ at u
such that

\[ \delta\pi(\bar u) = \bar x. \]













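Then $\pi(u) = x$. As $\theta(\bar u)$ is an element of $T_o(F)$ and u is a map of F onto $F_x$
such that $u(o) = \sigma(x)$, the image $\delta u(\theta(\bar u))$ of $\theta(\bar u)$ by the differential of u is a
tangent vector to $F_x$ at $\sigma(x)$. Now we shall show that $\delta u(\theta(\bar u))$ depends only on
$\bar x$ and is independent of the choice of $\bar u$ such that $\delta\pi(\bar u) = \bar x$. If $\bar u'$ is a tangent
vector to $P'$ at the same point u such that $\delta\pi(\bar u') = \bar x$, then $\bar u' - \bar u$ is tangent
to the fibre through u, and from the property ($\theta$.3), $\theta(\bar u' - \bar u) = 0$; hence

\[ \theta(\bar u') = \theta(\bar u), \qquad \delta u(\theta(\bar u')) = \delta u(\theta(\bar u)). \]

If $\bar u' = \bar u s$ for some $s \in G'$, then $\bar u'$ is tangent to $P'$ at us and

\[ \theta(\bar u') = L_s^{-1}\theta(\bar u). \]

Hence

\[ \delta(us)\,\theta(\bar u') = \delta u\cdot\delta s\cdot L_s^{-1}\theta(\bar u) = \delta u\cdot\theta(\bar u). \]

This completes the proof, because, for any $\bar u' \in T(P')$ such that

\[ \delta\pi(\bar u') = \delta\pi(\bar u), \]

there is an element $s \in G'$ such that $\bar u' s$ is tangent at the same point as $\bar u$
and

\[ \delta\pi(\bar u' s) = \delta\pi(\bar u). \]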


If the vector subspace $\mathfrak f$ of $\mathfrak g$ satisfies (2.1) and (2.2), it can be identified with
$T_o(F)$. Therefore $\theta$ is considered as an $\mathfrak f$-valued linear differential form and the
property ($\theta$.2) is replaced by

($\theta$.2') $\theta(\bar u s) = s^{-1}\theta(\bar u)s$, $\quad \bar u \in T(P'),\ s \in G'$.

A homogeneous space F = G/G’ is called weakly reductive (8), if there is a
vector subspace $\mathfrak f$ of $\mathfrak g$ satisfying (2.1) and (2.2).


THEOREM 1. A fibre bundle B satisfying the conditions (s.1)-(s.3) is soudé to
M, if and only if there exists a $T_o(F)$-valued linear differential form $\theta$ on $P'$
possessing the properties ($\theta$.1)-($\theta$.3). If the homogeneous space F = G/G’ is weakly
reductive, then $\theta$ is considered as an $\mathfrak f$-valued linear differential form and the
property ($\theta$.2) is replaced by ($\theta$.2').


Remarks on weakly reductive homogeneous spaces. In either of the 
following cases, the homogeneous space F is weakly reductive: 

(1) G’ is compact, 

(2) G’ is semi-simple and connected, 

(3) G’ is discrete. 

If F is an affine space (resp. Euclidean space) and G is the affine transforma- 
tion group (resp. the group of motion) of F, then F is weakly reductive. 

If F = G/G’ is weakly reductive, then there exists an affine connection
on F invariant by G (8). Therefore the linear isotropic group (the group of the
transformations $L_s$, $s \in G'$) is isomorphic to the isotropic group G’. If F is a
real projective space and G is the projective transformation group of F, then
F = G/G’ is not weakly reductive.
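
The affine case mentioned in the remarks can be checked by hand. The sketch below assumes the usual realisation of the affine group of $\mathbf R^n$ by $(n+1)\times(n+1)$ matrices and takes ad(s) to be conjugation by s; this matrix model and the letters A, b, v, w are illustrative choices, not notation from the text.

```latex
% Assumption: G = affine group of R^n as block matrices; G' = isotropy group of the origin.
\[
G=\Bigl\{\begin{pmatrix}A&b\\0&1\end{pmatrix}\Bigr\},\qquad
G'=\Bigl\{\begin{pmatrix}A&0\\0&1\end{pmatrix}\Bigr\},\qquad
\mathfrak f=\Bigl\{\begin{pmatrix}0&v\\0&0\end{pmatrix}\Bigr\},
\qquad A\in GL(n,\mathbf R),\ b,v\in\mathbf R^n .
\]
\[
\begin{pmatrix}A&0\\0&1\end{pmatrix}
\begin{pmatrix}0&v\\0&0\end{pmatrix}
\begin{pmatrix}A^{-1}&0\\0&1\end{pmatrix}
=\begin{pmatrix}0&Av\\0&0\end{pmatrix}\in\mathfrak f,
\qquad
\begin{pmatrix}0&v\\0&0\end{pmatrix}
\begin{pmatrix}0&w\\0&0\end{pmatrix}=0 .
\]
```

In this model $\mathfrak g = \mathfrak g' + \mathfrak f$ with $\mathfrak g' \cap \mathfrak f = \{0\}$, the first computation gives (2.2), and the second gives $[\mathfrak f, \mathfrak f] = 0$, the stronger condition used later for affine connections.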

























3. Connections of Cartan. We shall use the same notations as in §2. 

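An infinitesimal connection in P is defined by a $\mathfrak g$-valued linear differential
form $\tilde\omega$ on P with

($\tilde c$.1) $\tilde\omega(u\bar s) = s^{-1}\bar s$, $\quad u \in P,\ \bar s \in T_s(G)$,

($\tilde c$.2) $\tilde\omega(\bar u s) = s^{-1}\tilde\omega(\bar u)s$, $\quad \bar u \in T(P),\ s \in G$.

The meaning of $s^{-1}\bar s$ and $s^{-1}\tilde\omega(\bar u)s$ is explained in §1.

Let $\omega$ be the restriction of the form $\tilde\omega$ on $P'$. Then $\omega$ is also a $\mathfrak g$-valued linear
differential form such that

(c.1) $\omega(u\bar s) = s^{-1}\bar s$, $\quad u \in P',\ \bar s \in T_s(G')$,

(c.2) $\omega(\bar u s) = s^{-1}\omega(\bar u)s$, $\quad \bar u \in T(P'),\ s \in G'$.

The form $\omega$ does not give a connection in $P'$, because it is not $\mathfrak g'$-valued.
It is clear that, if $\omega$ is a $\mathfrak g$-valued linear differential form on $P'$ satisfying the
conditions (c.1) and (c.2), then it is the restriction of a unique differential
form $\tilde\omega$ on P satisfying the conditions ($\tilde c$.1) and ($\tilde c$.2).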

An infinitesimal connection in P defined by $\tilde\omega$ is called a connection of Cartan
(3), if the restricted form $\omega$ satisfies the following condition:

(c.3) If $\bar u \in T(P')$ and $\omega(\bar u) = 0$, then $\bar u$ is the zero vector. This implies that
$\omega$ defines an absolute parallelism on $P'$.


Suppose the homogeneous space F = G/G’ is weakly reductive, and let $\omega'$
be a $\mathfrak g'$-valued linear differential form on $P'$ which defines an infinitesimal
connection in $P'$. The form $\omega'$ satisfies the same conditions (c.1) and (c.2) as
the form $\omega$; the difference is that the one is $\mathfrak g'$-valued and the other is $\mathfrak g$-valued.
Let $\theta$ be the $\mathfrak f$-valued linear differential form on $P'$ in Theorem 1. We shall
show that the sum $\theta + \omega'$ satisfies the conditions (c.1)-(c.3). Put

\[ \omega = \theta + \omega'. \tag{3.1} \]

Then

\[ \omega(u\bar s) = \theta(u\bar s) + \omega'(u\bar s), \qquad u \in P',\ \bar s \in T_s(G'). \tag{3.2} \]

From ($\theta$.3), we obtain

\[ \omega(u\bar s) = \omega'(u\bar s), \qquad u \in P',\ \bar s \in T_s(G'). \tag{3.3} \]

As $\omega'$ is a form of connection in $P'$, we have

\[ \omega'(u\bar s) = s^{-1}\bar s, \tag{3.4} \]

which proves that $\omega$ satisfies (c.1).
We have

\[ \omega(\bar u s) = \theta(\bar u s) + \omega'(\bar u s), \qquad \bar u \in T(P'),\ s \in G'. \tag{3.5} \]

Since $\omega'$ is a form of connection in $P'$, we have

\[ \omega'(\bar u s) = s^{-1}\omega'(\bar u)s, \qquad \bar u \in T(P'),\ s \in G'. \tag{3.6} \]










From ($\theta$.2') and (3.6), it follows that

\[ \omega(\bar u s) = s^{-1}\omega(\bar u)s, \qquad \bar u \in T(P'),\ s \in G'. \tag{3.7} \]

Suppose

\[ \omega(\bar u) = 0, \tag{3.8} \]

which implies

\[ \theta(\bar u) = 0, \qquad \omega'(\bar u) = 0. \tag{3.9} \]

The first means that $\delta\pi(\bar u)$ is the zero vector, or that the vector $\bar u$ is vertical
in the sense of Ambrose (1), and the latter implies that the vector $\bar u$ is
horizontal (1) with respect to the connection in $P'$ defined by $\omega'$. Therefore $\bar u$ is the
zero vector.
We have proved the following


LEMMA 1. Suppose F = G/G’ is weakly reductive. If $\omega'$ is a $\mathfrak g'$-valued linear
differential form on $P'$ defining a connection in $P'$ and $\theta$ is an $\mathfrak f$-valued linear
differential form on $P'$ satisfying the conditions ($\theta$.1), ($\theta$.2'), ($\theta$.3), then the form
$\omega = \theta + \omega'$ defines a connection of Cartan in P; that is, $\omega$ is the restriction of
a form $\tilde\omega$ on P defining a connection of Cartan in P.


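Now, suppose that $\omega$ is a form on $P'$ satisfying the conditions (c.1)-(c.3).
Let $\theta$ (resp. $\omega'$) be the $\mathfrak f$ (resp. $\mathfrak g'$) component of $\omega$:

\[ \omega = \theta + \omega', \tag{3.10} \]
\[ \theta(\bar u) \in \mathfrak f, \qquad \omega'(\bar u) \in \mathfrak g', \qquad \bar u \in T(P'). \tag{3.11} \]

We shall prove that $\theta$ satisfies the conditions ($\theta$.1), ($\theta$.2'), ($\theta$.3) and that $\omega'$
defines a connection in $P'$.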


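Suppose

\[ \theta(\bar u) = 0. \tag{3.12} \]

Then

\[ \omega(\bar u) = \omega'(\bar u) \in \mathfrak g'. \tag{3.13} \]

Take an element $\bar s \in T_e(G')$ such that

\[ \bar s = -\,\omega'(\bar u). \tag{3.14} \]

($T_e(G')$ was identified with the Lie algebra $\mathfrak g'$ of G’.) Then²

\[ \omega(\bar u\bar s) = \omega(\bar u) + \bar s = 0. \tag{3.15} \]

From (c.3), it follows that $\bar u\bar s$ is the zero vector; hence

\[ \delta\pi(\bar u) = \delta\pi(\bar u\bar s) = 0, \tag{3.16} \]

which proves that $\theta$ satisfies ($\theta$.1).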


²The conditions (c.1) and (c.2) are equivalent to the following single condition:
$\omega(\bar u\bar s) = s^{-1}\bar s + s^{-1}\omega(\bar u)s$, because $\omega(\bar u\bar s) = \omega(u\bar s) + \omega(\bar u s)$. Putting s = e, we obtain (3.15).
















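Since $\omega(u\bar s) = s^{-1}\bar s$ is contained in $\mathfrak g'$, $\theta(u\bar s)$ vanishes for any $u \in P'$ and
$\bar s \in T_s(G')$; hence

\[ \omega(u\bar s) = \omega'(u\bar s), \qquad u \in P',\ \bar s \in T_s(G'). \tag{3.17} \]

Therefore $\theta$ satisfies ($\theta$.3) and $\omega'$ satisfies (c.1). We have

\[ \omega(\bar u s) = s^{-1}(\theta(\bar u) + \omega'(\bar u))s = s^{-1}\theta(\bar u)s + s^{-1}\omega'(\bar u)s, \qquad \bar u \in T(P'),\ s \in G'. \tag{3.18} \]

As the homogeneous space F is weakly reductive, $s^{-1}\theta(\bar u)s$ is contained in $\mathfrak f$.
Comparing (3.18) with the following equality

\[ \omega(\bar u s) = \theta(\bar u s) + \omega'(\bar u s), \tag{3.19} \]

we obtain

\[ \theta(\bar u s) = s^{-1}\theta(\bar u)s, \qquad \omega'(\bar u s) = s^{-1}\omega'(\bar u)s. \tag{3.20} \]

Therefore $\theta$ satisfies ($\theta$.2') and $\omega'$ satisfies (c.2).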


LEMMA 2. If a $\mathfrak g$-valued linear differential form $\omega$ on $P'$ satisfies the conditions
(c.1)-(c.3), then $\omega$ is the direct sum of an $\mathfrak f$-valued form $\theta$ satisfying ($\theta$.1), ($\theta$.2'),
($\theta$.3) and a form $\omega'$ defining an infinitesimal connection in $P'$.


Theorem 1 justifies the following definition: An $\mathfrak f$-valued linear differential
form $\theta$ is called a form of “soudure,” if $\theta$ satisfies the conditions ($\theta$.1)-($\theta$.3).


THEOREM 2. Suppose F = G/G’ is weakly reductive. Then, to every pair of
a soudure of B and a connection in $P'$, there corresponds a unique connection of
Cartan in P. Conversely, to each connection of Cartan in P, there corresponds a
unique pair of a soudure of B and a connection in $P'$. If we denote by $\theta$, $\omega'$, $\omega$
a form of soudure, a form of connection in $P'$, and a form (restricted on $P'$) of Cartan
connection in P respectively, then the correspondence is given by $\omega = \theta + \omega'$.


The Theorem follows immediately from Lemmas 1 and 2. 


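4. Structure equations. Let $\tilde\omega$ be a form on P defining a connection of
Cartan in P. Then we have

\[ d\tilde\omega = -\tfrac12[\tilde\omega, \tilde\omega] + \tilde\Omega, \tag{4.1} \]

where $\tilde\Omega$ is the curvature form (1; 3).
Consider the restricted form $\omega$ on $P'$. Then we have

\[ d\omega = -\tfrac12[\omega, \omega] + \Omega, \tag{4.2} \]

where $\Omega$ is the restriction of $\tilde\Omega$ on $P'$.
Assuming the homogeneous space F = G/G’ is weakly reductive, we
substitute $\omega = \theta + \omega'$ in (4.2) and we obtain

\[ d\theta + d\omega' = -\tfrac12\bigl([\theta, \omega'] + [\omega', \theta] + [\omega', \omega'] + [\theta, \theta]\bigr) + \Omega. \tag{4.3} \]

We decompose $[\theta, \theta]$ and $\Omega$ into two components as follows:

\[ [\theta, \theta] = [\theta, \theta]_{\mathfrak f} + [\theta, \theta]_{\mathfrak g'}, \qquad \Omega = \Omega_{\mathfrak f} + \Omega_{\mathfrak g'}, \]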













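where, for any $\bar u, \bar u' \in T(P')$ tangent at the same point,

\[ [\theta(\bar u), \theta(\bar u')]_{\mathfrak f} \in \mathfrak f, \qquad [\theta(\bar u), \theta(\bar u')]_{\mathfrak g'} \in \mathfrak g', \]
\[ \Omega_{\mathfrak f}(\bar u, \bar u') \in \mathfrak f, \qquad \Omega_{\mathfrak g'}(\bar u, \bar u') \in \mathfrak g'. \]

Then we obtain from (4.3) the following equalities:

\[ d\theta = -\tfrac12\bigl([\theta, \omega'] + [\omega', \theta] + [\theta, \theta]_{\mathfrak f}\bigr) + \Omega_{\mathfrak f}, \tag{4.4} \]
\[ d\omega' = -\tfrac12\bigl([\omega', \omega'] + [\theta, \theta]_{\mathfrak g'}\bigr) + \Omega_{\mathfrak g'}. \tag{4.5} \]

Putting

\[ \Theta = \Omega_{\mathfrak f} - \tfrac12[\theta, \theta]_{\mathfrak f}, \tag{4.6} \]

we call $\Theta$ the torsion form of the connection of Cartan. As the curvature form
$\Omega'$ of the connection in $P'$ defined by $\omega'$ is given by

\[ d\omega' = -\tfrac12[\omega', \omega'] + \Omega', \tag{4.7} \]

we obtain from (4.5) the following equality:

\[ \Omega' = \Omega_{\mathfrak g'} - \tfrac12[\theta, \theta]_{\mathfrak g'}. \tag{4.8} \]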
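
The passage from (4.2) to (4.4) and (4.5) rests on sorting the bracket terms by component. The sketch below spells this out, assuming the inclusions $[\mathfrak g', \mathfrak f] \subset \mathfrak f$ (the infinitesimal form of (2.2)) and $[\mathfrak g', \mathfrak g'] \subset \mathfrak g'$; the labels "f-part" and "g'-part" are introduced here only for illustration.

```latex
% Assumes [g',f] \subset f (infinitesimal form of (2.2)) and [g',g'] \subset g'.
\begin{align*}
[\omega,\omega] &= [\theta+\omega',\,\theta+\omega']
   = [\theta,\theta] + [\theta,\omega'] + [\omega',\theta] + [\omega',\omega'],\\
\mathfrak f\text{-part of } [\omega,\omega] &:\quad
   [\theta,\omega'] + [\omega',\theta] + [\theta,\theta]_{\mathfrak f},\\
\mathfrak g'\text{-part of } [\omega,\omega] &:\quad
   [\omega',\omega'] + [\theta,\theta]_{\mathfrak g'} .
\end{align*}
```

Since $d\theta$ is $\mathfrak f$-valued and $d\omega'$ is $\mathfrak g'$-valued, equating the two components of $d\omega = -\tfrac12[\omega,\omega] + \Omega$ separately gives the two structure equations, with $\Omega_{\mathfrak f}$ and $\Omega_{\mathfrak g'}$ as the respective curvature terms.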


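Now we obtain the following

THEOREM 3. Let $\Theta$ be the torsion form of the connection of Cartan and $\Omega'$
the curvature form of the connection in $P'$ defined by $\omega'$. Then we have

\[ d\theta = -\tfrac12\bigl([\theta, \omega'] + [\omega', \theta]\bigr) + \Theta, \qquad \Omega' = \Omega_{\mathfrak g'} - \tfrac12[\theta, \theta]_{\mathfrak g'}. \]

(1) If the homogeneous space F = G/G’ satisfies furthermore the condition

\[ [\mathfrak f, \mathfrak f] \subset \mathfrak g', \]

then we have

\[ \Theta = \Omega_{\mathfrak f}, \qquad \Omega' = \Omega_{\mathfrak g'} - \tfrac12[\theta, \theta]. \]

(2) If the homogeneous space F = G/G’ satisfies the stronger condition

\[ [\mathfrak f, \mathfrak f] = 0, \]

then we have

\[ \Theta = \Omega_{\mathfrak f}, \qquad \Omega' = \Omega_{\mathfrak g'}. \]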

Remarks. A homogeneous space F is called symmetric, if it satisfies the 
assumption of (1) in Theorem 3. On such a space F, there exists (8) an affine 





















symmetric connection invariant under G. If F is an affine space and G is the 
affine transformation group, then F satisfies the assumption of (2) in Theorem 
3. In this case, a connection in P’ is called a linear connection (because the 
structure group G’ is the general linear group). If F is an affine space and B 
is the tangent bundle 7(M), then there is a canonical soudure in B. If we take 
always this canonical soudure, then Theorem 2 says that, to each linear 
connection in P’, there corresponds a unique connection of Cartan in P, 
which will be called an affine connection. Part (2) of Theorem 3 implies that the 
restriction on P’ of the curvature form of an affine connection is the sum of 
the torsion form and the curvature form of the corresponding linear connection 
(which is usually called the curvature form of the affine connection). 

It is worth pointing out that the holonomy group of the linear
connection corresponding to an affine connection is usually called the
homogeneous holonomy group of the affine connection. If the torsion form of an
affine connection vanishes, then the form $\Omega_{\mathfrak f}$ vanishes also ((2) of Theorem 3).
But this does not imply that the form $\tilde\Omega_{\mathfrak f}$, the $\mathfrak f$-component of the curvature form
of the affine connection (of which $\Omega_{\mathfrak f}$ is the restriction on $P'$), vanishes. That is
why the holonomy group of an affine connection without torsion contains
the translation part (7). And we shall see easily that, if the non-homogeneous
holonomy group coincides with the homogeneous holonomy group, then our
affine connection is flat.
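
The statement that, for an affine connection, the restricted curvature form is the sum of the torsion form and the curvature form of the linear connection can be read off from (4.6) and (4.8). A short sketch, assuming only $[\mathfrak f, \mathfrak f] = 0$ as in part (2) of Theorem 3:

```latex
% Assumes [f,f] = 0, so that [\theta,\theta] = 0 on P'.
\begin{align*}
\Theta &= \Omega_{\mathfrak f} - \tfrac12[\theta,\theta]_{\mathfrak f} = \Omega_{\mathfrak f},
\qquad
\Omega' = \Omega_{\mathfrak g'} - \tfrac12[\theta,\theta]_{\mathfrak g'} = \Omega_{\mathfrak g'},\\
\Omega &= \Omega_{\mathfrak f} + \Omega_{\mathfrak g'} = \Theta + \Omega' .
\end{align*}
```

In particular the torsion vanishes exactly when the restricted curvature form on $P'$ takes its values in $\mathfrak g'$.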


5. Invariant connections of Cartan. Consider a homogeneous space
F = G/G’. G is considered as a principal fibre bundle over the base manifold
F, with structure group G’ and with the natural projection (9)

\[ \pi : G \to F = G/G'. \]

Let P be the fibre bundle with fibre G (on which G’ acts on the left) associated
to the principal fibre bundle G described above. P is defined as follows. We
shall say two elements $(s_1, s_2)$ and $(s_3, s_4)$ of $G \times G$ are equivalent if there is an
element $s'$ of G’ such that

\[ s_1 s' = s_3, \qquad s'^{-1} s_2 = s_4. \tag{5.1} \]

P is the set of these equivalence classes with the natural structure of fibre
bundle; the projection of P onto the base manifold F is induced from the
mapping of $G \times G$ onto F:

\[ (s_1, s_2) \to \pi(s_1), \tag{5.2} \]

where $\pi$ is the natural projection of G onto F. The operation of G on $G \times G$
on the right given by

\[ (s_1, s_2)\,s = (s_1, s_2 s) \tag{5.3} \]

induces the operation of G on P on the right. In this way, P can be considered
as a principal fibre bundle with group G.













The injection of G into $G \times G$ such that $s \to (s, e)$, where e is the unit of G,
defines the injection of G into P. The submanifold G of P is stable under the
operation of G’ on the right; that is, if $u \in P$ belongs to the submanifold G,
then us belongs to G for any $s \in G'$.

LEMMA 3. The principal fibre bundle P is trivial; P is the direct product of
the base space F and the structure group G.

Proof. Define a mapping j of $G \times G$ onto $F \times G$ as follows:

\[ j(s_1, s_2) = (\pi(s_1), s_1 s_2). \tag{5.4} \]

Then j induces a mapping $j^0$ of P onto $F \times G$, which commutes obviously
with the operation of G on the right, proving the Lemma.
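
The claims in the proof of Lemma 3 can be verified directly from (5.1), (5.3) and (5.4). The sketch below assumes that G acts on $F \times G$ on the right by $(x, t)\,s = (x, ts)$, which is the natural action but is not written out in the text.

```latex
% Assumption: G acts on F x G by (x,t) s = (x, t s).
\begin{align*}
j(s_1 s',\, s'^{-1}s_2) &= \bigl(\pi(s_1 s'),\ s_1 s' s'^{-1} s_2\bigr)
   = \bigl(\pi(s_1),\ s_1 s_2\bigr) = j(s_1, s_2), && s' \in G',\\
j\bigl((s_1, s_2)\,s\bigr) &= j(s_1,\ s_2 s) = \bigl(\pi(s_1),\ s_1 s_2 s\bigr)
   = j(s_1, s_2)\,s, && s \in G,\\
j(s_1, s_2) = j(s_3, s_4) &\implies s_3 = s_1 s',\ \ s_4 = s'^{-1}s_2
   \ \text{ for some } s' \in G' .
\end{align*}
```

The first line shows that j is constant on equivalence classes, the second that the induced map commutes with the right action of G, and the third that the induced map is one-to-one.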


As P is trivial, the fibre bundle B with fibre F associated to the principal
fibre bundle P is also trivial:

\[ B = F \times F. \tag{5.5} \]

LEMMA 4. The fibre bundle B with fibre F associated to P is soudé (3) to the
base manifold F.

Proof. The conditions (s.1) and (s.2) of §2 are apparently satisfied. We take
the cross-section $\sigma$ defined as follows:

\[ F \ni x \to (x, x) \in F \times F = B. \tag{5.6} \]

The identification of T(F) with $T_\sigma(B)$ is given by

\[ T(F) \ni \bar x \to (x, \bar x) \in T_\sigma(B). \tag{5.7} \]

If we reduce the structure group G of P to G’, we obtain the principal fibre
bundle G, from which we started.


The fibre bundle G corresponds to the fibre bundle P’ in §2. Therefore 
we denote by P’ the fibre bundle G. 


A connection of Cartan in P is given by a $\mathfrak g$-valued linear differential form
$\omega$ on $P'$ (= G) satisfying the conditions (c.1)-(c.3). As $P'$ is a group space G,
G acts on $P'$ on the left as well as on the right. We shall define a left invariant
connection of Cartan; that is, we shall define a $\mathfrak g$-valued form $\omega$ on $P'$ such that

\[ \omega(s\bar u) = \omega(\bar u), \qquad \bar u \in T(P'),\ s \in G. \tag{5.8} \]

It is clear that such a form $\omega$ is unique and must be defined by

\[ \omega(u\bar s) = \bar s, \qquad u \in P',\ \bar s \in T_e(G). \tag{5.9} \]

In this case the structure equation of E. Cartan reduces to the equation of
Maurer-Cartan:

\[ d\omega = -\tfrac12[\omega, \omega]. \tag{5.10} \]
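
For a group of matrices the Maurer-Cartan equation can be checked in one line. The sketch below assumes that G is a matrix group, that the left invariant form is written $\omega = s^{-1}\,ds$, and that the bracket of forms is normalised by $[\omega,\omega](X,Y) = 2[\omega(X),\omega(Y)]$; these conventions are assumptions made for the illustration and may differ from those of (1; 3).

```latex
% Assumptions: matrix group, \omega = s^{-1} ds, [\omega,\omega](X,Y) = 2[\omega(X),\omega(Y)].
\begin{align*}
d(s^{-1}) &= -\,s^{-1}\,ds\;s^{-1},\\
d\omega &= d(s^{-1}) \wedge ds = -\,s^{-1}ds \wedge s^{-1}ds = -\,\omega \wedge \omega,\\
(\omega \wedge \omega)(X, Y) &= \omega(X)\omega(Y) - \omega(Y)\omega(X)
   = [\omega(X), \omega(Y)] = \tfrac12[\omega,\omega](X, Y).
\end{align*}
```

Combining the last two lines gives $d\omega = -\tfrac12[\omega,\omega]$, so under these conventions the curvature term in the structure equation is zero.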












THEOREM 4. There is a unique left invariant connection of Cartan in P.
It is given by a $\mathfrak g$-valued form $\omega$ on $P'$ (= G) defined as follows:

\[ \omega(u\bar s) = \bar s, \qquad u \in P',\ \bar s \in T_e(G). \]

The curvature form of the connection vanishes on $P'$, hence on P, too.

Proof. From (5.10), it follows that the curvature form vanishes on $P'$.
Let $\tilde\Omega$ be the curvature form. Then we have

\[ \tilde\Omega(\bar u s, \bar u' s) = s^{-1}\,\tilde\Omega(\bar u, \bar u')\,s, \qquad \bar u, \bar u' \in T_u(P),\ s \in G. \tag{5.11} \]

Since $\tilde\Omega$ vanishes on $P'$, it follows easily from (5.11) that $\tilde\Omega$ vanishes on P.

Suppose the homogeneous space F = G/G’ is weakly reductive. Let

\[ \omega = \theta + \omega' \tag{5.12} \]

be the decomposition of the form $\omega$ into an $\mathfrak f$-valued form $\theta$ and a $\mathfrak g'$-valued
form $\omega'$. The $\mathfrak g'$-valued form $\omega'$ defines a connection in the principal fibre
bundle $P'$ (= G) with group G’. Let $\Theta$ be the torsion form of the connection of
Cartan defined by $\omega$ and $\Omega'$ the curvature form of the connection in $P'$ defined
by $\omega'$. From Theorems 3 and 4, it follows that

\[ \Theta = -\tfrac12[\theta, \theta]_{\mathfrak f}, \tag{5.13} \]
\[ \Omega' = -\tfrac12[\theta, \theta]_{\mathfrak g'}. \tag{5.14} \]

THEOREM 5. Let $\omega$ be the $\mathfrak g$-valued form on $P'$ (= G) defining the invariant
connection of Cartan in P. Suppose the homogeneous space F = G/G’ is weakly
reductive and let $\omega = \theta + \omega'$ be the decomposition corresponding to a decomposition
of the Lie algebra $\mathfrak g$ satisfying (2.2). Then

(1) The torsion form of the connection of Cartan defined by $\omega$ is given by

\[ \Theta = -\tfrac12[\theta, \theta]_{\mathfrak f}. \]

(2) The curvature form of the connection in $P'$ defined by $\omega'$ is given by

\[ \Omega' = -\tfrac12[\theta, \theta]_{\mathfrak g'}. \]

(3) The torsion form vanishes, if and only if the homogeneous space F is
symmetric; that is,

\[ [\mathfrak f, \mathfrak f] \subset \mathfrak g'. \]

(4) The restricted holonomy group of the connection defined by $\omega'$ is an invariant
subgroup of the connected component of the unit of G’. And the Lie algebra of the
holonomy group is the linear closure of

\[ \{[f_1, f_2]_{\mathfrak g'};\ f_1, f_2 \in \mathfrak f\}. \]

Proof. We have only to prove (3) and (4). From ($\theta$.1) it follows that, for
any $f_1, f_2 \in \mathfrak f$ and $u \in P'$, there are $\bar u_1, \bar u_2 \in T_u(P')$ such that

\[ \theta(\bar u_1) = f_1, \qquad \theta(\bar u_2) = f_2. \tag{5.15} \]










Therefore, if the torsion form vanishes, then $[f_1, f_2]_{\mathfrak f} = 0$ for all $f_1, f_2 \in \mathfrak f$;
that is, the homogeneous space F is symmetric. Conversely, it is evident that,
if F is symmetric, the torsion form vanishes.

Now we shall prove (4). Take an arbitrary point $u_0$ in $P'$ and let $P^0$ be the
set of all points in $P'$ which can be joined to $u_0$ by horizontal curves (1) (with
respect to the connection defined by $\omega'$). In other words, we reduce the structure
group of $P'$ to the holonomy group of the connection defined by $\omega'$, and we
obtain the principal fibre bundle $P^0$ whose structure group is the holonomy
group. Then, by (1), the Lie algebra of the holonomy group is the linear closure of

\[ \{\Omega'(\bar u_1, \bar u_2);\ \bar u_1, \bar u_2 \in T_u(P^0),\ u \text{ running through } P^0\}, \tag{5.16} \]

which is equal to

\[ \{[\theta(\bar u_1), \theta(\bar u_2)]_{\mathfrak g'};\ \bar u_1, \bar u_2 \in T_u(P^0)\}. \tag{5.17} \]

Since, for any $f_1, f_2 \in \mathfrak f$ and $u \in P^0$, there are $\bar u_1, \bar u_2 \in T_u(P^0)$ satisfying
(5.15), the set (5.17) is equal to the set

\[ \{[f_1, f_2]_{\mathfrak g'};\ f_1, f_2 \in \mathfrak f\}. \tag{5.18} \]

Using Jacobi’s identity, we see easily that the linear closure of the set
(5.18) is an ideal of the Lie algebra $\mathfrak g'$ of G’. This completes the proof of (4).
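
The final step of the proof of (4) can be made explicit. The sketch below assumes $[\mathfrak g', \mathfrak f] \subset \mathfrak f$ (the infinitesimal form of (2.2)) and writes $X_{\mathfrak g'}$ for the $\mathfrak g'$-component of $X \in \mathfrak g$; the letter h for an element of $\mathfrak g'$ is introduced only for this illustration.

```latex
% Assumes [g',f] \subset f; h \in g', f_1, f_2 \in f.
\begin{align*}
[h, [f_1, f_2]] &= [[h, f_1], f_2] + [f_1, [h, f_2]]
   && \text{(Jacobi's identity)},\\
[h, X]_{\mathfrak g'} &= [h, X_{\mathfrak g'}]
   && \text{(since } [h, X_{\mathfrak f}] \in \mathfrak f,\ [h, X_{\mathfrak g'}] \in \mathfrak g'\text{)},\\
[h, [f_1, f_2]_{\mathfrak g'}] &= [[h, f_1], f_2]_{\mathfrak g'} + [f_1, [h, f_2]]_{\mathfrak g'} .
\end{align*}
```

Because $[h, f_1]$ and $[h, f_2]$ again lie in $\mathfrak f$, the right-hand side of the last line is a sum of two elements of the set of all $[f_1, f_2]_{\mathfrak g'}$, so the linear closure of that set is stable under bracketing with $\mathfrak g'$.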


Remark. The results in this section are closely related to those of Nomizu 
on invariant affine connections (8). The relation between them will be discussed 
in another paper. 


REFERENCES 


1. W. Ambrose and I. M. Singer, A theorem on holonomy, Trans. Amer. Math. Soc. 76 (1953), 
428-443. 

2. C. Chevalley, Theory of Lie groups (Princeton, 1946). 

3. C. Ehresmann, Les connexions infinitésimales dans un espace fibré différentiable, Colloque de 
Topologie (Bruxelles, 1950). 

4. S. Kobayashi, Le groupe des transformations qui laissent invariant le parallélisme, Colloque de 
Topologie (Strasbourg, 1954). 

5. S. Kobayashi, Espaces à connexion de Cartan complets, Proc. Jap. Acad. 30 (1954), 709-710.

6. S. Kobayashi, Espaces à connexions affines et riemanniennes symétriques, Nagoya Math. J. 9
(1955), 25-27.

7. S. Kobayashi and K. Nomizu, Curvature, torsion and holonomy (unpublished).

8. K. Nomizu, Invariant affine connections on homogeneous spaces, Amer. J. Math. 76 (1954),
33-65.

9. N. Steenrod, The topology of fibre bundles (Princeton, 1951).








University of Washington, Seattle 




















A GENERALIZATION OF AN INEQUALITY 
OF HARDY AND LITTLEWOOD 


K. T. SMITH 


1. Introduction. A well-known inequality of Hardy-Littlewood reads as
follows (4): if p > 1 and $f \ge 0$, then

\[ \int \tilde f(x)^p\,dx \le A \int f(x)^p\,dx, \]

where $\tilde f(x)$ is defined as the supremum of the numbers

\[ \frac{1}{u+v}\int_{x-u}^{x+v} f(t)\,dt; \]

the constant depends on p only. The statement obtained by putting p = 1
is false; its substitute reads:

\[ \int \tilde f(x)\,dx \le A \int f(x)\,dx + B \int f(x)\log^+ f(x)\,dx + \epsilon; \]

the constants depend on $\epsilon$ but not on f. The Hardy-Littlewood inequality has
had several important applications: to function theory, harmonic functions,
Fourier series, and the strong differentiability of multiple integrals, to
mention those with which the author is acquainted. The application to
harmonic functions is the following (4):


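Let $u(r, \varphi)$ be a non-negative harmonic function in the unit circle, and for each
$\varphi$ define

\[ \tilde u(\varphi) = \sup_{0<r<1} u(r, \varphi). \]

Then if p > 1,

\[ \int_0^{2\pi} \tilde u(\varphi)^p\,d\varphi \le A \sup_{0<r<1} \int_0^{2\pi} u(r, \varphi)^p\,d\varphi = A \int_0^{2\pi} u(\varphi)^p\,d\varphi, \]

where $u(\varphi) = u(1, \varphi)$ is the boundary function for u and where A is a constant
depending on p only.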


Since the original appearance of the inequality there have been a number of 
generalizations. It was formulated for n-dimensional space by Wiener (14) 
and used to prove dominated individual ergodic theorems. The n-dimensional 
case was used also by Calderón and Zygmund (3) to prove dominated pointwise
convergence of singular integrals. It was formulated for certain types of


Received February 28, 1955. Work performed under contract with Office of Naval Research 
N58304. 












locally compact topological groups with Haar measure by Calderón (2) and
used again to prove ergodic theorems. It was formulated in a weaker version
for metric spaces with an outer measure of the type considered below by
Rauch (8) and used to prove ergodic theorems and theorems about analytic
functions of several complex variables. The latter are described after Theorem
4 below.¹

The object of this note is to give the inequality a general form valid for 
certain types of measures on metric spaces and to give applications of the 
general form of the inequality to harmonic functions, subharmonic functions, 
and strong differentiability of multiple integrals. 
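
To see why the one-dimensional inequality of §1 cannot hold for p = 1, one can look at a function that is integrable but for which $f \log^+ f$ is not. The sketch below works on the interval $(0, \tfrac12)$ and assumes that the supremum defining $\tilde f(x)$ is taken over subintervals $(x-u, x+v)$ of that interval; both the choice of interval and the particular function are illustrative assumptions.

```latex
% Assumptions: domain (0,1/2); averages taken over subintervals (x-u, x+v) of the domain.
\begin{align*}
f(t) &= \frac{1}{t\,\log^2(1/t)}, \qquad 0 < t < \tfrac12,
\qquad \int_0^x f(t)\,dt = \frac{1}{\log(1/x)},\\
\tilde f(x) &\ge \lim_{v \to 0}\ \frac{1}{x+v}\int_0^{x+v} f(t)\,dt
   = \frac{1}{x\,\log(1/x)} \qquad (u \to x),\\
\int_0^{1/2} \frac{dx}{x\,\log(1/x)} &= \int_{\log 2}^{\infty} \frac{dw}{w} = \infty
   \qquad (w = \log(1/x)).
\end{align*}
```

Here $\int f\,dx = 1/\log 2$ is finite while $\int \tilde f\,dx$ diverges, so no bound of the form $\int \tilde f \le A \int f$ is possible; this is consistent with the extra term involving $f \log^+ f$ in the substitute inequality.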


2. The inequality. Most of the arguments are based upon a simple
covering theorem which appears implicitly in Banach’s proof of the Vitali
covering theorem. It can be stated as follows.²

LEMMA 1. Let $\mathfrak S$ be a family of spheres in a metric space. If $\mathfrak S$ satisfies the
conditions (i) and (ii) below, then it contains a disjoint sequence $\{S(x_n, r_n)\}$
such that

\[ \bigcup_{S(x,r) \in \mathfrak S} S(x, r) \subset \bigcup_{n=1}^{\infty} S(x_n, 4r_n). \]

The conditions are as follows:
(i) There is a number R such that for every $S(x, r) \in \mathfrak S$, $0 < r < R$.
(ii) If $\{S(x_n, r_n)\}$ is any disjoint sequence in $\mathfrak S$, then $r_n \to 0$.


By using the notation S(x,r) for the sphere with center x and radius r we 
agree tacitly that a sphere is an object determined by a center and a radius. 
In the applications of Lemma 1 it is the set of points included in the sphere
which is important. In order to apply the Lemma to a family of sets each of 
which is a sphere with respect to several centers and several radii it will be 
necessary to demonstrate the possibility of choosing for each set one center and 
one radius in such a way that the hypotheses of the Lemma are satisfied. This 
arrangement has been picked in order to avoid an unnecessary hypothesis 
excluding isolated points in the metric space. When the space consists solely 
of a finite number of isolated points the inequality becomes an inequality on 
finite sums of some interest in itself. It is to this special case that the greater 
part of the proof of Hardy and Littlewood is devoted. 
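
The covering statement of Lemma 1 can be illustrated in the simplest setting. The following is only a sketch under stated assumptions, not the argument of the paper: the family is finite, the metric space is the real line, and the disjoint spheres are picked greedily by decreasing radius (a standard device, assumed here for illustration). The names `greedy_disjoint` and `covered_by_dilates` are hypothetical. In this finite case every sphere of the family lies in a chosen sphere dilated by 3, hence also by the factor 4 of the Lemma.

```python
from typing import List, Tuple

Ball = Tuple[float, float]  # (center, radius) of a closed interval on the real line


def overlaps(a: Ball, b: Ball) -> bool:
    """True if the closed intervals [c - r, c + r] of a and b intersect."""
    return abs(a[0] - b[0]) <= a[1] + b[1]


def greedy_disjoint(balls: List[Ball]) -> List[Ball]:
    """Pick balls by decreasing radius, skipping any that meets one already picked."""
    chosen: List[Ball] = []
    for ball in sorted(balls, key=lambda b: b[1], reverse=True):
        if not any(overlaps(ball, c) for c in chosen):
            chosen.append(ball)
    return chosen


def covered_by_dilates(ball: Ball, chosen: List[Ball], factor: float = 4.0) -> bool:
    """True if `ball` lies inside S(x_n, factor * r_n) for some chosen ball."""
    c, r = ball
    return any(abs(c - cn) + r <= factor * rn for cn, rn in chosen)


# Illustrative family (hypothetical data).
balls = [(0.0, 1.0), (1.5, 0.8), (3.0, 0.5), (3.2, 0.4), (10.0, 2.0), (11.0, 0.3)]
chosen = greedy_disjoint(balls)
all_covered = all(covered_by_dilates(b, chosen) for b in balls)
```

A skipped ball meets a chosen ball of radius at least its own, which is why the dilation factor 3 already suffices in this finite sketch.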

We shall consider a metric space B on which there is a regular outer measure 
subject to the conditions which follow. 


¹Until told by the referee, the author was not aware of the work of Wiener, Calderón, and
Rauch. Recently Rauch has supplemented his note (8) with a paper (9) which will be found 
elsewhere in this journal; in the latter he obtains the full Hardy-Littlewood inequality by a 
method of Wiener, but with less precise constants than those in the theorems below. 

²Wiener, Calderón, and Rauch use similar covering theorems. A proof is given in (1).
















If $E$ is any set, $|E|$ is its measure and $\delta(E)$ is its diameter.

(a) Each sphere is measurable and has finite measure.

(b) There is a constant $K$ such that $|S(x, 4r)| \le K\,|S(x, r)|$ for every closed
sphere $S(x, r)$.

(c) If $\{S_n\}$ is a sequence of closed spheres such that $|S_n| \to 0$, then³ $\delta(S_n) \to 0$.

(d) If $\{S_n\}$ is a sequence of closed spheres such that $\delta(S_n) \to \infty$, then $|S_n| \to \infty$.

It is known that these conditions are sufficient to ensure that every Borel 
set in B is measurable. 

When $f$ is a non-negative measurable function belonging to some class
$L^p$, $p \ge 1$, on $B$ we make use of the following notations:

(i) $\tilde f(x)$ is the supremum of the averages of $f$ over all the closed spheres
centered at $x$; that is,

\[ \tilde f(x) = \sup \frac{1}{|S|} \int_S f(y)\,dy, \]

the supremum being taken over all closed spheres $S$ centered at $x$.

(ii) $f^*(t)$, defined for $t$ real and $> 0$, is the non-increasing equimeasurable
rearrangement of $f$ (that is, $|E_t[f^*(t) > a]| = |E_x[f(x) > a]|$ for all $a > 0$).
It is well known that for any measurable set $E \subset B$,

\[ \int_E f(x)\,dx \le \int_0^{|E|} f^*(t)\,dt, \]

and equality holds if $E = B$.

(iii) \[ \beta_f(t) = \frac{1}{t} \int_0^t f^*(s)\,ds. \]

$\beta_f$ is a continuous non-increasing function, strictly decreasing except possibly
in an interval beginning with 0 where it can be constant.

(iv) $\beta^f$ is the (upper semi-continuous) inverse function to $\beta_f$; if $s > \sup \beta_f(t)$,
then⁴ $\beta^f(s) = 0$.
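
The rearrangement notations can be made concrete on a finite set. The following is a minimal sketch, assuming counting measure on a finite index set (so that the non-increasing rearrangement is simply the values sorted in decreasing order and the averaging function at an integer argument k is the mean of the k largest values); the helper names `rearrangement` and `beta` are hypothetical. It checks the inequality between the integral of f over a set E and the integral of the rearrangement over (0, |E|).

```python
from typing import List, Sequence


def rearrangement(values: Sequence[float]) -> List[float]:
    """Non-increasing rearrangement for counting measure: sort in decreasing order."""
    return sorted(values, reverse=True)


def beta(values: Sequence[float], k: int) -> float:
    """Average of the rearrangement over (0, k], i.e. the mean of the k largest values."""
    fstar = rearrangement(values)
    return sum(fstar[:k]) / k


# Illustrative data (hypothetical): f on the index set {0, 1, 2, 3, 4}.
f = [3.0, 0.5, 2.0, 0.0, 1.0]
E = [1, 2, 4]                      # a subset of the index set

fstar = rearrangement(f)
integral_over_E = sum(f[i] for i in E)        # integral of f over E
rearranged_bound = sum(fstar[:len(E)])        # integral of f* over (0, |E|)
betas = [beta(f, k) for k in range(1, len(f) + 1)]
```

Taking E to be the whole index set gives equality, and the list `betas` is non-increasing, in agreement with the properties stated in (ii) and (iii).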


LEMMA 2. If $f \ge 0$ belongs to $L^p$, $p \ge 1$, then $\tilde f$ is lower semi-continuous.
Consequently $\tilde f$ is measurable.


Proof. Suppose that $x_n \to x$. If $S$ is an arbitrary closed sphere with center $x$
and radius $r$, let $S_n$ be the closed sphere with center $x_n$ and radius $r + d(x_n, x)$.
Since $S = \lim S_n$, it follows both that $|S| = \lim |S_n|$ and that

\[ \int_S f(y)\,dy = \lim \int_{S_n} f(y)\,dy \]


³The author's original condition was somewhat less general. The change to this condition
and a modification in the proof of Theorem 1 required by the change were suggested by N. 
Aronszajn. 

⁴The properties of $f^*$, $\beta_f$, and $\beta^f$ are described briefly in Calderón and Zygmund (3).













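and hence that

\[ \frac{1}{|S|} \int_S f(y)\,dy = \lim \frac{1}{|S_n|} \int_{S_n} f(y)\,dy \le \liminf \tilde f(x_n). \]

Since $S$ is an arbitrary closed sphere centered at $x$, taking the supremum over $S$ gives
$\tilde f(x) \le \liminf \tilde f(x_n)$.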


THEOREM 1. If $f \ge 0$ belongs to $L^p$, $p > 1$, then $\tilde f$ also belongs to $L^p$ and

\[ \int_B \tilde f(x)^p\,dx \le K \left(\frac{p}{p-1}\right)^p \int_B f(x)^p\,dx, \]

where $K$ is the constant of hypothesis (b) on $B$.


Proof. For the first part of the argument we suppose only $p \ge 1$. We begin
by noting that if $E$ is any measurable set of positive measure over which the
average of $f$ is $> t > 0$, then by Hölder's inequality

\[ t < \frac{1}{|E|} \int_E f(x)\,dx \le \frac{1}{|E|^{1/p}} \left\{ \int_E f(x)^p\,dx \right\}^{1/p}, \]

so that

\[ |E| < \frac{1}{t^p} \int_B f(x)^p\,dx. \]

That is, $|E|$ is bounded by a constant independent of $E$. If $\{E_n\}$ is a disjoint
sequence of sets over which the average of $f$ is $> t$, then

\[ E = \sum_{n=1}^{\infty} E_n \]

is also a set over which the average of $f$ is $> t$. Consequently $\sum |E_n| = |E| < \infty$,
so that $|E_n| \to 0$.

Now, if $t > 0$, let $B_t$ denote the set of points $x$ such that $\tilde f(x) > t$. For each
$x \in B_t$, $t$ fixed, let $S_x$ be a closed sphere centered at $x$ over which the average
of $f$ is $> t$. Furthermore, choose $S_x$ with positive measure and so that it admits a
non-zero radius. Let $\mathfrak{S}$ be the family of these spheres. It will be shown that it is
possible to choose for each $x \in B_t$ a radius $r(x)$ such that $S_x = S(x, r(x))$ and
such that with this choice of centers and radii for the spheres in $\mathfrak{S}$, $\mathfrak{S}$ satisfies
the conditions of Lemma 1.

For each $x \in B_t$ let $r_0(x)$ be the infimum and $r_1(x)$ the supremum of the
numbers $r$ such that $S_x = S(x, r)$. If $r_0(x) \ge |S_x|$, then take $r(x) = r_0(x)$.
If $r_0(x) < |S_x|$, then take

\[ r(x) = \min\left[ \tfrac{1}{2}\big(r_0(x) + r_1(x)\big),\ |S_x| \right]. \]

Clearly $r(x) \ne 0$ and $S_x = S(x, r(x))$. From the first paragraph of the proof it
follows that the numbers $|S_x|$ for $S_x \in \mathfrak{S}$ are bounded, and then from condition
(d) on $B$ it follows that the numbers $\delta(S_x)$ for $S_x \in \mathfrak{S}$ are bounded. Now,
either $r(x) = r_0(x) \le \delta(S_x)$ or $r(x) \le |S_x|$, so the numbers $r(x)$ for $S_x \in \mathfrak{S}$ are
bounded; and condition (i) of Lemma 1 is verified. If $\{S_{x_n}\}$ is a disjoint
sequence in $\mathfrak{S}$, then again from the first paragraph it follows that
$|S_{x_n}| \to 0$, and from condition (c) on $B$ it follows that $\delta(S_{x_n}) \to 0$.

Thus $r(x_n) \to 0$, and condition (ii) of Lemma 1 is verified.
Having verified (i) and (ii), we can apply Lemma 1 to extract from $\mathfrak{S}$ a
disjoint sequence $\{S(x_n, r_n)\}$ such that

\[ B_t \subset \sum_{S(x,r) \in \mathfrak{S}} S(x, r) \subset \sum_{n=1}^{\infty} S(x_n, 4r_n). \]

Then

\[ |B_t| \le \sum |S(x_n, 4r_n)| \le K \sum |S(x_n, r_n)| = K|E|, \]

where $K$ is the constant in hypothesis (b), and $E = \sum S(x_n, r_n)$. As before,
the average of $f$ over $E$ is $> t$. (Therefore the first paragraph provides a bound
for $|E|$, but this bound is not sharp enough.) We have however,

\[ t < \frac{1}{|E|} \int_E f(x)\,dx \le \frac{1}{|E|} \int_0^{|E|} f^*(s)\,ds = \beta_f(|E|), \]

and by inverting $\beta_f$, $|E| \le \beta^f(t)$. Finally, therefore, $|B_t| \le K\beta^f(t)$.


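Now let $p > 1$. In the following chain of inequalities we use the fact
that

\[ \lim_{s \to \infty} s\,\beta_f(s)^p = \lim_{s \to 0} s\,\beta_f(s)^p = 0, \]

and we use the substitution $t = \beta_f(s)$. We have⁵

\[ \int_B \tilde f(x)^p\,dx = p \int_0^{\infty} t^{p-1} |B_t|\,dt \le K p \int_0^{\infty} t^{p-1} \beta^f(t)\,dt \]
\[ = -K \int_0^{\infty} s\,d\big[\beta_f(s)^p\big] = K \int_0^{\infty} \beta_f(s)^p\,ds = K \int_0^{\infty} \left\{ \frac{1}{s} \int_0^s f^*(t)\,dt \right\}^p ds \]
\[ \le K \left(\frac{p}{p-1}\right)^p \int_0^{\infty} f^*(t)^p\,dt = K \left(\frac{p}{p-1}\right)^p \int_B f(x)^p\,dx. \]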
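
The inequality of Theorem 1 can be checked numerically in the finite case mentioned after Lemma 1. The sketch below is an illustration only, under these assumptions: the space is {0, ..., n-1} with the metric |x - y| and counting measure, the maximal function is the supremum of averages over closed spheres centered at x, and the theorem is read as: the sum of the p-th powers of the maximal function is at most K (p/(p-1))^p times the sum of the p-th powers of f. The function names and the test data are hypothetical.

```python
from typing import List


def ball(n_points: int, x: int, radius: int) -> range:
    """Closed sphere S(x, radius) in {0, ..., n_points - 1} with metric |x - y|."""
    return range(max(0, x - radius), min(n_points - 1, x + radius) + 1)


def maximal_function(f: List[float]) -> List[float]:
    """Supremum of the averages of f over closed spheres centered at each point."""
    n = len(f)
    out = []
    for x in range(n):
        best = 0.0
        for k in range(n):  # integer radius k stands for all real radii in [k, k + 1)
            s = ball(n, x, k)
            best = max(best, sum(f[y] for y in s) / len(s))
        out.append(best)
    return out


def doubling_constant(n: int) -> float:
    """Smallest K with |S(x, 4r)| <= K |S(x, r)| for all real r > 0.

    For r in [k, k + 1) the sphere S(x, r) has integer radius k, while
    S(x, 4r) is contained in the sphere of integer radius 4k + 3.
    """
    return max(len(ball(n, x, 4 * k + 3)) / len(ball(n, x, k))
               for x in range(n) for k in range(n))


f = [0.0, 1.0, 4.0, 0.0, 2.0, 0.0, 0.0, 3.0, 0.0, 0.0, 1.0, 0.0]
p = 2.0
K = doubling_constant(len(f))
f_max = maximal_function(f)
lhs = sum(m ** p for m in f_max)
rhs = K * (p / (p - 1)) ** p * sum(v ** p for v in f)
```

For twelve points the doubling constant computed this way is 7, attained by one-point spheres at interior points.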


THEOREM 2. If $f$ is non-negative and measurable, and if $f(x) \log^+ f(x)$ is
integrable, then for any measurable set $E$,

\[ \int_E \tilde f(x)\,dx \le 2K \int_B f(x) \log^+ f(x)\,dx + \left( \frac{4K}{e} + 2 \right) |E|. \]


⁵This concluding calculation can be found in Calderón and Zygmund (3). The last inequality
in the chain is a well-known inequality of Hardy. 










Proof. It is well known that if $f \log^+ f$ is integrable, then $f$ itself is integrable
over every set of finite measure. This is all that is necessary to the formation
of the function $\tilde f$. It does not guarantee the existence of $f^*$, however, so we
write $f = g + h$, where $g(x) = f(x)$ if $f(x) \le 2$ and $g(x) = 0$ otherwise.

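It is clear that $\int_E \tilde g(x)\,dx \le 2|E|$, so if it can be proved that

\[ (2.1) \qquad \int_E \tilde h(x)\,dx \le 2K \int_B h(x) \log^+ h(x)\,dx + \frac{4K}{e} |E|, \]

then we will have

\[ \int_E \tilde f(x)\,dx \le \int_E \tilde g(x)\,dx + \int_E \tilde h(x)\,dx \le 2K \int_B f(x) \log^+ f(x)\,dx + \left( \frac{4K}{e} + 2 \right) |E|. \]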


Now, $h$ is in fact an integrable function, so the proof will be complete if we
prove (2.1) for integrable functions. 

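Let us call the integrable function $f$, rather than $h$, so that the notations
used in Theorem 1 will be appropriate. Let $B_t' = E \cap B_t$. Then $|B_t'| \le |E|$
and, as was proved in Theorem 1, $|B_t'| \le |B_t| \le K\beta^f(t)$. Hence

\[ \int_E \tilde f(x)\,dx = \int_0^{\infty} |B_t'|\,dt \le \int_0^{t_0} |B_t'|\,dt + K \int_{t_0}^{\infty} \beta^f(t)\,dt \]

for any $t_0$, while also, for $t_0 = \beta_f(|E|)$,

\[ \int_{t_0}^{\infty} \beta^f(t)\,dt = -\int_0^{|E|} s\,d\beta_f(s) = -|E|\,\beta_f(|E|) + \int_0^{|E|} \beta_f(s)\,ds. \]

Furthermore,

\[ \int_0^{|E|} \beta_f(s)\,ds = \int_0^{|E|} \frac{1}{s} \int_0^s f^*(s')\,ds'\,ds = \int_0^{|E|} f^*(s') \log \frac{|E|}{s'}\,ds' \]
\[ \le 2 \int_0^{|E|} f^*(s') \log^+ f^*(s')\,ds' + \frac{2}{e} \int_0^{|E|} \left( \frac{|E|}{s'} \right)^{1/2} ds' \le 2 \int_B f(x) \log^+ f(x)\,dx + \frac{4}{e} |E|. \]

Therefore

\[ \int_E \tilde f(x)\,dx \le (1 - K)\,|E|\,\beta_f(|E|) + 2K \int_B f(x) \log^+ f(x)\,dx + \frac{4K}{e} |E|. \]

Since necessarily $K \ge 1$, (2.1) follows.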

Once the estimate for $|B_t|$ is obtained, the evaluation is almost identical
with that given in Calderón and Zygmund (3) in the proof of Theorem 2.
One of the inequalities used in the chain is that of W. H. Young, namely
$ab \le a \log a + e^{b-1}$.
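
Young's inequality, taken here in the standard form ab ≤ a log a + e^(b-1) for a > 0 and real b (an assumed reading of the formula quoted above), can be sanity-checked on a grid. This is a minimal sketch with hypothetical names and grid values; it is a numerical illustration, not a proof. Equality holds when log a = b - 1.

```python
import math


def young_gap(a: float, b: float) -> float:
    """Right side minus left side of ab <= a log a + e^(b - 1); should be >= 0."""
    return a * math.log(a) + math.exp(b - 1.0) - a * b


# Grid of a in (0, 10] and b in [-5, 5] (illustrative ranges).
grid_a = [0.01 * k for k in range(1, 1001)]
grid_b = [-5.0 + 0.05 * k for k in range(0, 201)]
min_gap = min(young_gap(a, b) for a in grid_a for b in grid_b)

# Equality case: a = e^(b - 1), for instance a = e and b = 2.
equality_gap = young_gap(math.exp(1.0), 2.0)
```

The minimum over the grid is non-negative up to rounding, and the gap vanishes at the equality case.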


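THEOREM 3. If $f$ is non-negative and integrable, and $0 < \epsilon < 1$, then for every
measurable set $E$,

\[ \int_E \tilde f(x)^{1-\epsilon}\,dx \le \frac{K}{\epsilon}\,|E|^{\epsilon} \left\{ \int_B f(x)\,dx \right\}^{1-\epsilon}. \]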










Proof. The proof is the same as the last part of the proof of Theorem 3 of
Calderón and Zygmund (3). Use must be made, of course, of our previous
estimate for $|B_t|$.


3. Applications. Harmonic functions. We propose to apply the
inequality to the case where $B$ is a smooth surface bounding a bounded domain
$D$ in Euclidean $n$-dimensional space $R^n$, $n \ge 3$. The explicit smoothness
assumptions are as follows.⁶


(a) $B$ is a $C^1$ surface; that is, each point of $B$ has an $n$-dimensional neighborhood
$V$ which can be mapped in 1-1 fashion on an $n$-dimensional cube by a transformation
$T$ such that $T$ and $T^{-1}$ are $C^1$ transformations and such that $T(B \cap V)$ is
the intersection of the cube with one of the coordinate hyperplanes.

(b) $B$ is of bounded curvature in the large sense, that is, if $\alpha(x, y)$ denotes the
angle between the exterior normals at $x$ and $y$, then

\[ \frac{1}{\rho_0} = \sup_{x, y \in B} \frac{2 \sin \tfrac{1}{2}\alpha(x, y)}{|x - y|} < \infty. \]

The metric in B is its metric as a subset of Euclidean space. The measure on 
B is the area measure, definable in the classical manner because of the smooth- 
ness conditions. 

We shall use capital letters $P$, $Q$, etc. to designate points in the interior of
$D$, and small letters $x$, $y$, etc., to designate points on the boundary $B$. Each
point $P$ at distance less than $\rho_0$ from $B$ lies on a unique line segment of length
less than $\rho_0$ and normal to $B$. We write $x_P$ for the point at which this segment
meets $B$. Proceeding from the opposite direction, we write $P(\rho, x)$ for the point
at distance $\rho$ from $x$ measured along the interior normal through $x$. Finally,
we write $\mathfrak{H}_p$ for the class of functions $f(P)$ harmonic in $D$ and such that

\[ \sup_{0 < \rho < \rho_0} \int_B |f[P(\rho, x)]|^p\,dx < \infty. \]

(Note that $f[P(\rho, x)]$ is the restriction of $f(P)$ to the surface parallel to $B$ at
distance $\rho$.)

It is known that if $f \in \mathfrak{H}_p$ then $f(P)$ has a limit $f(x)$ as $P \to x$ non-tangentially
(13) for almost every point $x \in B$. The so defined function $f(x)$, which
belongs to $L^p$ on $B$, is called the boundary function of $f$. When $p > 1$, the
functions $f[P(\rho, x)]$ converge in mean of order $p$ to the boundary function as
$\rho \to 0$, and $f$ is the Poisson integral of the boundary function.⁷ We shall make


⁶A forthcoming note by Aronszajn will contain proofs of all the needed properties of such
surfaces. The object of his note is to exhibit the best possible constants in all cases. Here we 
do not need the best constants, but only the qualitative sense of the properties, and for the 
most part this is classical information. 

⁷A proof of the mean convergence can be found in (1). A related result concerning the
constant surfaces of the Green's function (for fixed pole) rather than the parallel surfaces is 
proved by Privaloff and Kouznetzoff (7). 













use of an inequality between the values of f in D and the mean values of the 
boundary function over spheres in B. 


Mean value inequality. If the harmonic function $f(P)$ is the Poisson
integral of its boundary function $f(x)$, and if $\tilde f(x)$ denotes the supremum of the
averages of $|f(y)|$ over the closed spheres in $B$ centered at $x$, then for every $P$ within
distance $\rho_0$ of $B$ we have $|f(P)| \le A \tilde f(x_P)$. The constant $A$ depends only on $\rho_0$ and
on the dimension⁸ $n$.


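THEOREM 4. For each $f$ in $\mathfrak{H}_p$ let

\[ F(x) = \sup_{0 < \rho < \rho_0} |f[P(\rho, x)]|. \]

For $p > 1$, the following assertion holds: if $f$ belongs to $\mathfrak{H}_p$, then $F$ belongs to $L^p$
on $B$, and

\[ \int_B F(x)^p\,dx \le K \left(\frac{p}{p-1}\right)^p A^p \int_B |f(x)|^p\,dx = \lim_{\rho \to 0} K \left(\frac{p}{p-1}\right)^p A^p \int_B |f[P(\rho, x)]|^p\,dx. \]
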

Proof. The metric space B and its measure are obviously of the type con- 
sidered in the second section, so Theorem 4 follows directly from Theorem 1 
and the mean-value inequality. 

In the case of the circle in the plane this is the theorem of Hardy and Little- 
wood quoted in the introduction. The related theorem of Rauch (8; 9) on 
analytic functions is as follows: 

If $D$ is the sphere, if $f$ is complex valued, and if $n$ is even and the variables can
be paired so that $f$ is an analytic function of $n/2$ complex variables, then the
assertion of Theorem 4 holds for any exponent $p > 0$.

Rauch's theorem is obtained from the special case of exponent 2 in Theorem
6 below by putting $s(P) = |f(P)|^{p/2}$.


THEOREM 5. If $f$ belongs to $\mathfrak{H}_1$, then for each $\epsilon$, $0 < \epsilon < 1$,

\[ \int_B F(x)^{1-\epsilon}\,dx < \infty. \]


Proof. The function $f \in \mathfrak{H}_1$ has a boundary measure $\mu$ in terms of which it
can be represented as a Poisson-Stieltjes integral. The mean value inequality
is valid here in a suitably modified form; namely, $|f(P)|$ is less than or equal
to a constant times the upper bound of the quotients $\mu(C)/|C|$ taken over the
closed spheres $C$ in $B$ centered at $x_P$. We do not give more of the proof for it is
essentially the same as the proof of Theorem 7 below on subharmonic functions. 


Remark. The proof of Theorem 5 is not based on Theorem 3, for f does not 
necessarily have a boundary function of which it is the Poisson integral. It is 
plain that if f does have such a boundary function, then certain conclusions 
can be drawn from Theorems 2 and 3. It does not seem necessary to state 
the conclusions. 


⁸This is a special case of an inequality which is proved in Aronszajn and Smith (1). This
special case was obtained for the circle in the plane by Hardy and Littlewood (4). 










Subharmonic functions. 
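THEOREM 6. Let $s(P)$ be a non-negative subharmonic function in $D$, and let

\[ \tilde s(x) = \sup_{0 < \rho < \rho_0} s[P(\rho, x)]. \]

For $p > 1$, we have⁹

\[ \int_B \tilde s(x)^p\,dx \le K \left(\frac{p}{p-1}\right)^p A^p \sup_{0 < \rho < \rho_0} \int_B s[P(\rho, x)]^p\,dx. \]
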

Proof. We suppose that the right side is finite. For each $\rho$, $0 < \rho < \rho_0$,
we write $B_\rho$ for the set of points in $D$ at distance $\rho$ from $B$, and $D_\rho$ for the
sub-domain of $D$ bounded by $B_\rho$. We write $f_\rho$ for the harmonic function in
$D_\rho$ with the same boundary values¹⁰ as $s$. It is well known that the harmonic
functions $f_\rho$ converge increasingly as $\rho \to 0$ to a harmonic function $f \in \mathfrak{H}_p$
which dominates $s$ (10). In addition the functions $s[P(\rho, x)]$ converge weakly
in $L^p$ on $B$ as $\rho \to 0$ to the boundary function $f(x)$ for $f$. Theorem 6 follows from
Theorem 4 and the lower semi-continuity of the norm in $L^p$ with respect to
weak convergence.

By using the results of F. Riesz on the representation of subharmonic 
functions by potentials we can prove similar theorems for subharmonic 
functions which are not necessarily non-negative. For the sake of simplicity 
we confine the discussion to the sphere, though the results are equally valid 
for the more general domains of the last paragraph, as the proofs will show. 

The theorem of F. Riesz states that if s(P) is a subharmonic function in 
the domain D, then a necessary and sufficient condition that s(P) be the sum 
of a harmonic function and the Green's potential of a negative Borel measure 
on D is that s(P) be bounded above in D by a harmonic function; the harmonic 
function which figures in the representation is the smallest harmonic function 
which bounds s(P) above (10). This function is called the smallest harmonic 
majorant of $s(P)$. If the positive part of $s(P)$, which we call $s^{+}(P)$, satisfies
the condition

\[ \sup_{0 < \rho < 1} \int_B s^{+}(\rho x)\,dx < \infty, \]

then the smallest harmonic majorant $h(P)$ exists and satisfies

\[ \sup_{0 < \rho < 1} \int_B |h(\rho x)|\,dx < \infty. \]

Therefore, as $s(P) = -\int_D G(P, Q)\,d\mu(Q) + h(P)$, where $\mu$ is a positive Borel
measure on $D$ and $G(P, Q)$ is the Green's function of $D$; and as $h(P)$ satisfies
the hypotheses of Theorem 5, the analogue of Theorem 5 for subharmonic 


⁹For the circle in the plane this is a result of Hardy and Littlewood (4).
¹⁰The surfaces $B_\rho$ are also $C^1$ and of bounded curvature. The curvature constant $\rho_0'$ for
$B_{\rho'}$ is $\rho_0 - \rho'$. $f_\rho$ is defined by the Poisson integral over $B_\rho$.










functions will result from an analysis of the first term, the Green's potential, 
alone. Before stating the theorem we observe that the upper bound of |s(P)| 
along the various radii will be identically infinite whenever the Green's potential 
is infinite at the origin. Therefore a small sphere with center at 0 must be 
removed from D before taking the upper bounds. 


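THEOREM 7. Let $\rho_0$ be a fixed number between 0 and 1, and put

\[ \tilde s(x) = \sup_{\rho_0 < \rho < 1} |s(\rho x)| \]

for each $x \in B$. If $s(P)$ is subharmonic in $D$ and if

\[ \sup_{0 < \rho < 1} \int_B s^{+}(\rho x)\,dx < \infty, \]

then for each $\epsilon$, $0 < \epsilon < 1$,

\[ \int_B \tilde s(x)^{1-\epsilon}\,dx < \infty. \]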


Proof. As we have mentioned, it results from Theorem 5 and the theorem
of F. Riesz that we need only prove the theorem for functions of the type
$s(P) = -\int_D G(P, Q)\,d\mu(Q) = -u(P)$, where $\mu$ is a positive Borel measure
on $D$ and $G(P, Q)$ is the Green's function for $D$. The explicit expression for the
Green's function is well known.

\[ (3.1) \qquad G(P, Q) = \frac{1}{(n-2)\,\omega_n} \left[ \frac{1}{|P - Q|^{n-2}} - \frac{r^{n-2}}{|Q|^{n-2}\,|P - Q'|^{n-2}} \right], \]

where

\[ Q' = \frac{r^2}{|Q|^2}\,Q, \]

and $\omega_n$ is the area of the surface of the unit sphere.

It is known that the Green's potential $u(P) = \int_D G(P, Q)\,d\mu(Q)$ either is
identically $+\infty$ or is finite except at a set of points of outer capacity 0.
For the latter to be the case it is necessary and sufficient that¹¹ $\int_D (r - |Q|)\,d\mu(Q) < \infty$.
For each $x \in B$ and each real $\xi$, $0 < \xi \le 2r$, let $C(x, \xi)$ be the
sphere in $B$ with center $x$ and radius $\xi$; and let $S(x, \xi)$ be the conical sector in
$D$ generated by joining each point of $C(x, \xi)$ to the origin. Let

\[ I(x, \xi) = \int_{S(x, \xi)} (r - |Q|)\,d\mu(Q), \]

and let

\[ \tilde m(x) = \sup_{\xi} \frac{I(x, \xi)}{|C(x, \xi)|}. \]

¹¹This is clear for the sphere. In the case of more general domains the integrand $r - |Q|$
is replaced by $|x_Q - Q|$, the distance from $Q$ to the boundary. In this form the fact was observed
by Privaloff and Kouznetzoff (7).










For the present we assume the following Lemma. 


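LEMMA 3. There is a constant $A$ such that $\tilde u(x) \le A \tilde m(x)$, where $\tilde u$ is defined
like $\tilde s$.

The covering theorem is used as in the proof of the general Hardy-Littlewood
inequality. Let $B_t$, $t > 0$, denote the set of points $x$ such that $\tilde u(x) > t$. If
$x \in B_t$, then there is a $\xi_x$ such that

\[ \frac{I(x, \xi_x)}{|C(x, \xi_x)|} > \frac{t}{A}. \]

Choosing such a $\xi_x$ for each $x \in B_t$ we have $B_t \subset \sum C(x, \xi_x)$, so by the covering
theorem there is a disjoint sequence $C(x_n, \xi_n)$ $(\xi_n = \xi_{x_n})$ such that

\[ B_t \subset \sum_{n=1}^{\infty} C(x_n, 4\xi_n). \]

If $K$ is chosen so that for all $C$, $|C(x, 4\xi)| \le K |C(x, \xi)|$, then¹²

\[ |B_t| \le \sum |C(x_n, 4\xi_n)| \le K \sum |C(x_n, \xi_n)| \le \frac{KA}{t} \sum I(x_n, \xi_n) = \frac{KA}{t} \int_E (r - |Q|)\,d\mu(Q), \]

where $E$ is the sum of the disjoint sets $S(x_n, \xi_n)$. Hence $|B_t| \le k'/t$ for
$k' = KA \int_D (r - |Q|)\,d\mu(Q)$. Now

\[ \int_B \tilde u(x)^{1-\epsilon}\,dx = (1 - \epsilon) \int_0^{\infty} t^{-\epsilon} |B_t|\,dt \le k'' \int_0^1 t^{-\epsilon}\,dt + k'(1 - \epsilon) \int_1^{\infty} t^{-1-\epsilon}\,dt, \]

where $k''$ is larger than $(1 - \epsilon)$ times the area of the surface of the sphere.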


Proof of the Lemma. We shall not give the entire proof. The calculations,
which are routine, are achieved by majorizing the Green's function ((3.2)
below) and considering separately the integrals over three different parts of
the sphere. The majoration for the Green's function is obtained by inspection
of the explicit formula (3.1).¹³

(3.2) There is a constant $k$ such that

\[ 0 \le G(P, Q) \le k\,(r - |P|)(r - |Q|) / |P - Q|^n; \]

also

\[ G(P, Q) \le \frac{1}{\omega_n (n - 2)} \cdot \frac{1}{|P - Q|^{n-2}}. \]





¹²$|\ |$ is used for subsets of $D$ to refer to Lebesgue measure and for subsets of $B$ to refer to
the area measure on $B$.

¹³Essentially the same majoration and division of the sphere are used by Littlewood (6) to
prove that in the case of the circle in the plane a Green’s potential has radial limit 0 at almost 
every boundary point. The majoration is valid for any domain bounded by a C*-surface of 
bounded curvature (11) and for even more general domains (5). 










The division of the sphere is as follows. Let $P$ be fixed with $|P| \ge \rho_0 r$, let
$x = rP/|P|$, and let $\xi_0 = r - |P|$. One part of the sphere is the exterior of the
conical sector $S(x, \xi_0)$; another is that part of $S(x, \xi_0)$ whose points $Q$ satisfy
$r - |Q| \ge 4(r - |P|)$; the third is that part of $S(x, \xi_0)$ whose points $Q$ satisfy
$r - |Q| < 4(r - |P|)$. Finally, it is necessary to use an evaluation of $|P - Q|$
in terms of the variable

\[ \xi = \left| \frac{r}{|Q|}\,Q - \frac{r}{|P|}\,P \right|. \]

(3.3) There is a constant $k$ such that if $|P| \ge \rho_0 r$, then $|P - Q| \ge k\xi$.

The Lemma results from simple calculation with these estimates and the
remark that the quotient $|C(x, \xi)| / \xi^{n-1}$ is bounded above and bounded away from 0.

Theorem 7 can be improved if it is known that the measure is the indefinite 
integral of a density subject to certain conditions. 


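THEOREM 8. If $u(P) = \int_D G(P, Q) f(Q)\,dQ$ where $f(Q)$ is such that

\[ \int_D (r - |Q|)^p f(Q)^p\,dQ < \infty, \qquad p > 1, \]

then $\tilde u(x)$ belongs to $L^p$ on $B$, and there is a constant $M$ such that

\[ \int_B \tilde u(x)^p\,dx \le M \int_D (r - |Q|)^p f(Q)^p\,dQ. \]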


Proof. The proof is similar to the proof of the last theorem, but it is possible
to make use of the non-increasing rearrangements as in the proof of Theorem 1
in order to obtain better evaluations. With the notations of the last theorem
we have, as we had there,

\[ |B_t| \le K \sum |C(x_n, \xi_n)| = K|C|, \qquad C = \sum C(x_n, \xi_n). \]

Because of the disjointness it happens as in Theorem 1 that

\[ t < \frac{A}{|C|} \int_E (r - |Q|) f(Q)\,dQ \]

(where again $E = \sum S(x_n, \xi_n)$). From the fact that $|E| = (r/n)|C|$, it follows
that

\[ t < \frac{A}{|C|} \int_0^{|E|} g^*(s)\,ds = \frac{rA}{n|E|} \int_0^{|E|} g^*(s)\,ds = \frac{rA}{n}\,\beta_g(|E|) \]

for $g(Q) = (r - |Q|) f(Q)$. Hence

\[ |B_t| \le K|C| = \frac{Kn}{r}\,|E| \le \frac{Kn}{r}\,\beta^g\!\left( \frac{nt}{rA} \right). \]

The proof is finished in the same manner as the proof of Theorem 1.



















Strong differentiability of double integrals. The general Hardy- 
Littlewood inequality yields a generalization of the theorem of Jessen, 
Marcinkiewicz, and Zygmund on the strong differentiability of multiple inte- 
grals (12). However, we need the inequality in a slightly stronger form. 


Theorems 1, 2, and 3 remain true and their proofs remain correct when f(x) 
is redefined to be the supremum of the averages of f(y) over all spheres containing x. 


THEOREM 9. Let $B_1$ and $B_2$ be metric spaces with measures of the kind considered
in §2, and let $f(x, y)$ be a measurable function on $B_1 \times B_2$. If

\[ \int_{B_1} \int_{B_2} |f(x, y)| \log^+ |f(x, y)|\,dx\,dy < \infty, \]

then the indefinite integral of $f$ is almost everywhere derivable in the strong sense.
That is, for almost every choice of $(x, y)$,

\[ \lim_{n \to \infty} \frac{1}{|S_{1,n}|\,|S_{2,n}|} \int_{S_{1,n}} \int_{S_{2,n}} f(s, t)\,ds\,dt \]

exists for all sequences $\{S_{1,n}\}$ and $\{S_{2,n}\}$ of closed spheres such that $x \in S_{1,n}$,
$y \in S_{2,n}$, $\delta(S_{1,n}) \to 0$, and $\delta(S_{2,n}) \to 0$.


Proof (cf. 12, pp. 147-149). Several earlier theorems are required (notably, 
the Vitali covering theorem, the strong density theorem, and the theorem 
on the strong differentiability of the indefinite integral of a bounded function) ; 
these theorems are true in the present case, and the proofs given by Saks are 
valid after simple modifications. 


Remark. It was noticed by Hardy and Littlewood and by Calderón and
Zygmund that the Hardy-Littlewood inequality leads to certain results on
integral operators. The results are of such a kind as to establish dominated
convergence of sequences of transforms. Thus, for example, Hardy and Littlewood
show dominated convergence of the Fejér polynomials formed from the
Fourier series of a function f; and Calderón and Zygmund show dominated
convergence of singular integrals. Our general case of the inequality leads to
similar results, which can be used, for example, to give another proof of Theorem 
4. However, since we do not have applications which would lead to new 
results, we shall omit the statement of this theorem on integral operators. 
In any case it is a re-phrasing in the abstract terms of the theorems of the 
authors cited. 


REFERENCES 


1. N. Aronszajn and K. T. Smith, Functional spaces and functional completion. To appear shortly in Ann. Inst. Fourier, Grenoble.
2. A. P. Calderón, A general ergodic theorem, Ann. Math., 58 (1953), 182-191.
3. A. P. Calderón and A. Zygmund, On the existence of certain singular integrals, Acta Math., 88 (1952), 85-139.
4. G. H. Hardy and J. E. Littlewood, A maximal theorem with function-theoretic applications, Acta Math., 54 (1930), 81-116.
5. M. Keldych and M. Lavrentieff, Sur une évaluation pour la fonction de Green, C.R. Ac. Sci. U.S.S.R., 24 (1939), 22-24.
6. J. E. Littlewood, On functions subharmonic in a circle, Lond. Math. Soc., 2 (1927), 192-196.
7. I. I. Privaloff and P. Kouznetzoff, Sur les problèmes limites et les classes différentes de fonctions harmoniques et subharmoniques définies dans un domaine arbitraire, Rec. Math. Moscou, 6 (1939), 345-376.
8. H. E. Rauch, Généralisation d'une proposition de Hardy et de Littlewood et de théorèmes ergodiques qui s'y rattachent, C.R. Ac. Sci. Paris, 227 (1948), 887-889.
9. H. E. Rauch, Harmonic and analytic functions of several variables and the maximal theorem of Hardy and Littlewood, Can. J. Math., 8 (1956), 171-183.
10. F. Riesz, Sur les fonctions subharmoniques et leur rapport à la théorie du potentiel, Acta Math., 54 (1930), 321-360.
11. A. Rosenblatt, Sur la fonction de Green d'un domaine borné de l'espace à trois dimensions, C.R. Ac. Sci. Paris, 201 (1935), 22-24.
12. S. Saks, Theory of the Integral (New York, 1937).
13. C. de la Vallée Poussin, Propriétés des fonctions harmoniques dans un domaine ouvert limité par des surfaces à courbure bornée, Ann. Scuola Norm. Sup. Pisa, (2) (1933), 167-197.
14. N. Wiener, The ergodic theorem, Duke Math. J., 5 (1939), 1-18.





University of Kansas





















HARMONIC AND ANALYTIC FUNCTIONS OF SEVERAL 
VARIABLES AND THE MAXIMAL THEOREM OF 
HARDY AND LITTLEWOOD 


H. E. RAUCH 


1. Introduction and principal theorems. The present paper, an
edited excerpt from my dissertation,¹ arose from the suggestion of S.
Bochner that I try to extend the maximal theorem of Hardy and Littlewood
(2) to functions analytic in the solid unit hypersphere

\[ S_{2n}: \quad r^2 = |z_1|^2 + \dots + |z_n|^2 < 1. \]

If one writes the analytic function of $n$ complex variables, $f(z_1, \dots, z_n)$, as
$f(r, P)$ where

\[ P \in S_{2n-1}: \quad |z_1|^2 + \dots + |z_n|^2 = 1, \]

then the theorem in question and its generalization are contained in


THEOREM 1. If, for some $\lambda > 0$, $f$ satisfies

\[ (1) \qquad \int_{S_{2n-1}} |f(r, P)|^{\lambda}\,dV_P \le C^{\lambda}, \qquad r < 1, \]

where $dV_P$ is the volume element on $S_{2n-1}$ at $P$ and $C^{\lambda}$ is a constant, then for the
same $\lambda$

\[ (2) \qquad \int_{S_{2n-1}} \Big( \sup_{0 \le r < 1} |f(r, P)| \Big)^{\lambda}\,dV_P \le a_{\lambda} C^{\lambda}, \]

$a_{\lambda}$ being independent of $f$.
From Theorem 1 one can deduce a generalization of a classical theorem due
to the brothers Riesz (7, Chap. VII):


THEOREM 2. Under the same general hypotheses in and preceding Theorem 1,
and assuming (1), there exists a function $f(P)$ of class $L^{\lambda}$ on $S_{2n-1}$ such that

\[ (3) \qquad \lim_{r \to 1} \int_{S_{2n-1}} |f(r, P) - f(P)|^{\lambda}\,dV_P = 0. \]


Received May 6, 1955. 

¹Princeton, 1947. Abstracts of the results appeared as (3) and (4). The decision to publish
in full, after so long a delay, is motivated by repeated requests of other workers in the field
and by overlapping with published material obtained independently by Zygmund, Calderón,
and others. A particular impetus is the preceding paper by K. T. Smith (5), in which a sub- 
stantial part of the underlying methods and results of my paper are obtained independently 
from a point of view not too different from mine. 












As Zygmund remarked (9), Theorem 2 follows immediately from Theorem 1
and the theorem of Calderón and Zygmund to the effect that, under the same
hypotheses, $f(r, P)$ has a point-wise limit, $f(P)$, almost everywhere. For the
latter convergence would be majorized according to (2) and would, therefore,
imply mean convergence. However, in the less delicate case $\lambda > 1$, (3) follows
directly from (2) without the intervention of the theorem on point-wise
convergence, as will be seen in §5.

Now the proof of Theorem 1, to be found in §5, can be reduced by means
of a sequence of theorems on analytic, harmonic, and subharmonic functions
(§§4 and 5) exactly as in (2) to the proof of a theorem of purely real-variable
nature:


THEOREM 3. Let $f(P)$ belong to $L^p$, $p > 1$, on

\[ S_{n-1}: \quad x_1^2 + \dots + x_n^2 = 1, \]

and let $\sigma_r(P)$ be the spherical cap of radius $r$ (measured on $S_{n-1}$) about $P$ on
$S_{n-1}$ and $V(r)$ its volume as measured on $S_{n-1}$. Define $f^*(P)$ by

\[ (4) \qquad f^*(P) = \sup_{0 < r \le \pi} \frac{1}{V(r)} \int_{\sigma_r(P)} |f(P')|\,dV_{P'}. \]

Then $f^*(P)$ satisfies

\[ (5) \qquad \int_{S_{n-1}} \{f^*(P)\}^p\,dV_P \le C_{n,p} \int_{S_{n-1}} |f(P)|^p\,dV_P, \]

where $C_{n,p}$ depends only² on $n$ and $p$. If $p = 1$ this is no longer true; however, if
$|f(P)| \log^+ |f(P)|$ is integrable, then

\[ (6) \qquad \int_{S_{n-1}} f^*(P)\,dV_P \le B_n \int_{S_{n-1}} |f(P)| \log^+ |f(P)|\,dV_P + C_n. \]


The proof of Theorem 3 is the essence of the matter, and the method of
analysis was supplied by Wiener in a profound paper (6). There he shows, by a
reasoning which is closely related to F. Riesz's proof of the case $n = 1$ of
Theorem 3, but simpler and more powerful, that both Birkhoff's ergodic
theorem and the Hardy-Littlewood theorem for $n = 1$ have a common source
and that both can be extended by the same method, the former to a theorem
on averages over an $n$-parameter abelian group, the latter to a theorem on
averages over Euclidean $n$-space.

Now, Wiener in a lucid fashion reduces everything to a simple measure- 
theoretic lemma, which he calls “‘of Vitali type” although it is much more 
elementary. In studying his paper I noticed that this lemma, although formu- 
lated for sets in ordinary n-space, in fact applied to a more general situation 
from which, in particular, Theorem 3 would follow by Wiener’s arguments. 

A diagnosis of the elements needed explicitly or implicitly in extending 


²Wiener's method does not deliver the best constants. Smith's paper (5) does.




Wiener's argument to, say, the surface of the hypersphere leads one to describe
a metric space with a metric $M$ and an outer measure $m$ as having
Euclidean character or Property A if, without regard to logical niceties, it is
such that (i) spheres of equal radius in $M$ have equal measure in $m$ and vice
versa (this very restrictive condition of homogeneity may be replaced by a
much weaker one of a sort of uniformity in important cases); (ii) countable
sets are null-sets; and, most important, (iii) the measure of the set $\gamma$ covered
by a sphere $\sigma$ and all spheres overlapping $\sigma$ and having smaller or equal radius
satisfies $m(\gamma) \le C m(\sigma)$ where $C$ depends only on $M$ and $m$.

Then one has 


THEOREM A. In a space possessing Property A let a set $S$ of outer measure
$m(S)$ be such that every $P \in S$ is the center of one member, $\sigma(P)$, of a certain
family of spheres. Then given $\epsilon > 0$ there is a finite number of mutually disjoint
members, $\sigma_i$, of the family such that

\[ (7) \qquad \sum_i m(\sigma_i) > C^{-1} m(S) - \epsilon, \]

where $C$ is the constant of Property A.


For $(n - 1)$-space and $S_{n-1}$ (as will be seen) $C = 3^{n-1}$. The proof of Theorem
A will occupy §2, and the deduction of Theorem 3, §3.

The generality of Theorem A permits the immediate extension of Wiener’s 
generalization of Birkhoff's ergodic theorem to those groups of measure- 
preserving transformations of a set which admit an invariant metric possessing 
Property A and which may well be non-commutative. This application is in 
my dissertation;³ but I do not reproduce it here since there is already a surfeit
of related ergodic theorems on the market. 

Besides the hypersphere there are other generalizations of the unit circle, 
notably the polycylinder: $|z_1| = r_1 < 1, \dots, |z_n| = r_n < 1$ whose boundary
is the multitorus,

\[ T_n: \quad |z_1| = 1, \dots, |z_n| = 1. \]


The analogue of Theorem 1 for this domain was derived independently and 
announced almost simultaneously by Zygmund (8) in stronger form and me 
(3). I prove it again here not merely because the proof is different but because 
the technique of proof will serve to demonstrate a more interesting generaliza- 
tion (4): 


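THEOREM 4. Let $f(z_1, \dots, z_n)$ be analytic in $|z_1| < 1, \dots, |z_n| < 1$ and
satisfy

\[ (8) \qquad \int_0^{2\pi} \!\cdots\! \int_0^{2\pi} |f(r_1 e^{i\theta_1}, \dots, r_n e^{i\theta_n})|^{\lambda}\,d\theta_1 \dots d\theta_n \le C^{\lambda}, \qquad 0 \le r_1, \dots, r_n < 1; \]

then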

*The reference at the end of (4) to ergodic theorems for compact groups is erroneous or at 
least misleading. The theorems actually meant are analogous to ergodic theorems (6) but deal 
with averages over sets tending to zero (like derivatives). 










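(9) $\int_0^{2\pi} \cdots \int_0^{2\pi} \{\sup_A |f(r_1 e^{i\theta_1}, \ldots, r_n e^{i\theta_n})|\}^{\lambda}\, d\theta_1 \cdots d\theta_n \le a_{\lambda} C^{\lambda}$,

where $a_{\lambda}$, $C^{\lambda}$ have the same meanings as before and $A$ is the region, for fixed
$\theta_1, \ldots, \theta_n$, described by

(10) $0 < \dfrac{1 - r_i}{1 - r_j} < K$; $i, j = 1, \ldots, n$, $i \ne j$,

$K$ being any positive constant.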


The necessity for (10) is related to the fact that the values of a function
harmonic in each $|z_i| < 1$ are determined solely by its boundary values on $T_n$,
which is thus a "distinguished boundary surface" in Bergman's terminology.

If in Theorem 3 and Theorem A one observes that when dealing with $T_n$,
spheres may be replaced by hyper-cubes, while $C = 3^n$, then one will see that
Theorem 4 follows from them as Theorem 3 does, provided, that is, that one
proves the more delicate version of the connecting link between Theorems 1
and 3 (§4).


2. Proof of Theorem A. Consider any point $P$ of $S$ whose $\sigma(P)$ overlaps
only a finite number of $\sigma(P)$. This certainly implies that we can find some
sphere (not a $\sigma(P)$!) with $P$ as center such that within this sphere there is
no other point of $S$. This, by familiar reasoning, implies that the set of such
points $P$ is denumerable, hence of measure 0. Let us, therefore, discard this
set and the $\sigma(P)$ belonging to it and ignore them in further reasoning.

Let $B_1$ be the least upper bound of the radii of $\sigma(P)$. Obviously one may
assume $B_1 < \infty$. Otherwise, the theorem is trivial.

After this remark I shall, in fact, prove (7) with $S$ replaced by the set
consisting of the union $\sigma$ of all $\sigma(P)$, and $m(S)$ replaced by $m(\sigma) - \epsilon$ for any $\epsilon$.
This, of course, will prove (7), since $\sigma$ contains $S$.

Let $V(B_1)$ be the volume of a sphere of radius $B_1$. By the definition of $B_1$,
one can choose $\sigma(P_1)$ such that its volume $V_1 > V(B_1) - \tfrac12\epsilon$ (obviously
$V_1 \le V(B_1)$). Let $\sigma_1$ be the set consisting of the union of $\sigma(P_1)$ and all
adjoining $\sigma(P)$. The volume of $\sigma_1$ is not greater than $C\,V(B_1)$. In fact, let $\bar\sigma$
be a sphere of radius $B_1$ about $P_1$ (not a $\sigma(P)$!) and hence of volume $V(B_1)$.
Now $\sigma_1$ is certainly contained in the union of $\bar\sigma$ and all spheres adjoining it.
But these latter are certainly of radius $\le B_1$ by the definition of $B_1$. Hence
Property A implies that the union in question has volume $\le C\,V(B_1)$. Let
$V(B_2)$ be the least upper bound of the volumes of those $\sigma(P)$ not in $\sigma_1$. Choose
such a $\sigma(P_2)$ whose volume $V_2 > V(B_2) - \tfrac14\epsilon$. Let $\sigma_2$ be the union of $\sigma(P_2)$
and all adjoining $\sigma(P)$ not in $\sigma_1$. As before, the volume of $\sigma_2$ is $\le C\,V(B_2)$.

Continue this process inductively. One obtains a sequence $\sigma(P_k)$, obviously
disjoint, with volumes $V_k$ subject to these inequalities, where the $V(B_k)$ are
defined similarly,

$\sum_{k=1}^{\infty} \big(V(B_k) - \epsilon/2^k\big) < \sum_{k=1}^{\infty} V_k \le m(\sigma)$.










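Since $\sum_k V(B_k)$ is convergent, $V(B_k) \to 0$. This implies that the union of the $\sigma_k$,
where $\sigma_k$ is defined inductively as above, exhausts all $\sigma(P)$; for, if it did not,
but omitted, say, one $\sigma(P')$, then from some $k'$ on $V(B_k)$ would equal the
volume of $\sigma(P')$.

Summing up, one has $\sigma = \bigcup_k \sigma_k$. Therefore,

$m(\sigma) \le \sum_k m(\sigma_k) \le C \sum_k V(B_k)$

but

$\sum_k V_k > \sum_k V(B_k) - \epsilon \ge C^{-1} \sum_k m(\sigma_k) - \epsilon$.

Now one chooses $K$ so that

$\sum_{k > K} V_k < \epsilon$

and one has finally

$C^{-1} m(\sigma) - 2\epsilon < \sum_{k=1}^{K} V_k$.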
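The selection process of this proof can be tried out on the real line, where spheres are intervals and Property A holds with $C = 3$. The following is only a sketch under simplifying assumptions: the family is finite, so the largest remaining radius is attained exactly (the proof above works with a least upper bound and an $\epsilon$), and the function names are illustrative, not part of the paper.

```python
def select_disjoint(intervals):
    """Greedy selection for a finite family of open intervals (center, radius).

    Repeatedly take the interval of largest radius that does not overlap
    any interval chosen so far; overlapping intervals are discarded.
    """
    by_radius = sorted(intervals, key=lambda cr: cr[1], reverse=True)
    chosen = []
    for center, radius in by_radius:
        # Open intervals are disjoint iff the centers are at least r + r' apart.
        if all(abs(center - c) >= radius + r for c, r in chosen):
            chosen.append((center, radius))
    return chosen


def union_length(intervals):
    """Lebesgue measure of the union of the intervals (center, radius)."""
    spans = sorted((c - r, c + r) for c, r in intervals)
    total = 0.0
    current = None
    for lo, hi in spans:
        if current is None or lo > current[1]:
            if current is not None:
                total += current[1] - current[0]
            current = [lo, hi]
        else:
            current[1] = max(current[1], hi)
    if current is not None:
        total += current[1] - current[0]
    return total


family = [(0, 1), (1.5, 1), (3, 0.5), (10, 2), (11, 0.25)]
chosen = select_disjoint(family)
chosen_measure = sum(2 * r for _, r in chosen)
# Inequality (7) with C = 3 and epsilon = 0 for this finite family.
print(chosen, 3 * chosen_measure >= union_length(family))
```

Every discarded interval overlaps a chosen one of at least its own radius, so it lies inside the threefold dilation of that chosen interval; this is the one-dimensional form of Property A used above.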


LEMMA 1. Theorem A applies to $S_{n-1}$ with spherical caps as the $\sigma(P)$ and to
$T_n$ with hypercubes, $-\phi \le \theta_i \le \phi$ $(i = 1, \ldots, n)$, as $\sigma(P)$, where $C = 3^{n-1}$
in the first and $C = 3^n$ in the second case.


Proof. The first part is obvious as is the very last statement. In dealing with
$S_{n-1}$ I remark that

$V_r = C_n \int_0^r \sin^{n-2}\theta \, d\theta$.

Now the volume of a sphere of radius $r$ plus those adjoining it of smaller
radius is certainly less than or equal to that of a sphere of radius $3r$. Since
$3r \le \pi$,

$\sin 3r = 3 \sin r - 4 \sin^3 r \le 3 \sin r$

so that $\sin^{n-2} 3r \le 3^{n-2} \sin^{n-2} r$. Therefore

$\int_0^{3r} \sin^{n-2}\theta \, d\theta = 3 \int_0^r \sin^{n-2} 3\theta' \, d\theta' \le 3^{n-1} \int_0^r \sin^{n-2}\theta \, d\theta$.
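The tripling inequality for spherical caps can be checked numerically. The sketch below assumes a plain midpoint rule for the integral $\int_0^r \sin^{n-2}\theta\,d\theta$ and cap radii with $3r \le \pi$; the helper names and the step count are illustrative choices, not taken from the paper.

```python
import math


def cap_integral(radius, n, steps=2000):
    """Midpoint-rule approximation of the integral of sin^(n-2) over [0, radius]."""
    h = radius / steps
    return h * sum(math.sin((k + 0.5) * h) ** (n - 2) for k in range(steps))


def tripling_ratio(radius, n):
    """Ratio of the cap integral for radius 3r to the cap integral for radius r."""
    return cap_integral(3 * radius, n) / cap_integral(radius, n)


for n in range(2, 7):
    for r in (0.05, 0.5, 1.0, math.pi / 3):
        ratio = tripling_ratio(r, n)
        # Lemma 1 asserts that this ratio never exceeds 3^(n-1).
        print(n, round(r, 3), ratio <= 3 ** (n - 1) + 1e-9)
```

For $n = 2$ the integrand is constant and the ratio is exactly 3, so the constant $3^{n-1}$ cannot be improved in that case.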


3. Proof of Theorem 3. The key to Theorem 3 is the important 


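THEOREM 5. The measure of the set $S_{\alpha}$ of points $P$ for which $f^*(P) > \alpha$ does
not exceed

$\dfrac{3^{n-1}}{\alpha} \int_{S_{n-1}} |f(P)| \, dV_P$.

It also does not exceed

$\dfrac{2 \cdot 3^{n-1}}{\alpha} \int_{|f(P)| > \frac12 \alpha} |f(P)| \, dV_P$.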











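Proof. For each $P \in S_{\alpha}$ by definition one can find an $r_P$ such that

$\int_{\sigma_P(r_P)} |f(P')| \, dV_{P'} > V(r_P)\,\alpha$.

By Lemma 1 one can find a finite number of the $\sigma_P(r_P)$ whose total measure
exceeds $3^{-(n-1)} m(S_{\alpha}) - \epsilon$. One has then

$\int_{S_{n-1}} |f(P)| \, dV_P \ge \int_{\Sigma} |f(P)| \, dV_P > 3^{-(n-1)} m(S_{\alpha})\,\alpha - \alpha\epsilon$

where $\Sigma$ is the finite set of $\sigma_P(r_P)$. The last statement is proved as follows:
Let $h(P) = |f(P)|$ when $|f(P)| > \tfrac12\alpha$, otherwise zero. Let $h^*(P)$ be defined in
the same manner as $f^*(P)$. Obviously, we have $f^*(P) \le h^*(P) + \tfrac12\alpha$. Consequently
$m(S_{\alpha}) \le$ the measure of the set of $P$ for which $h^*(P) > \tfrac12\alpha$, which
by the preceding part of the theorem is

$\le \dfrac{2 \cdot 3^{n-1}}{\alpha} \int_{S_{n-1}} h(P) \, dV_P = \dfrac{2 \cdot 3^{n-1}}{\alpha} \int_{|f(P)| > \frac12\alpha} |f(P)| \, dV_P$.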





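A similar proof yields

THEOREM 6. Let $f(P)$ belong to $L$ on the multitorus $T_n$. Let $\gamma(P)$ be the "cube"

$\theta_i - \phi \le \theta_i' \le \theta_i + \phi$ $(i = 1, \ldots, n)$

with $P$ as center and side $2\phi$. Then the measure $m(S_{\alpha})$ of the set of points $P$
where

$f^*(P) = \underset{0 < \phi \le \pi}{\mathrm{l.u.b.}}\ \dfrac{1}{(2\phi)^n} \int_{\gamma(P)} |f(P')| \, dV_{P'} > \alpha$

is

$\le \dfrac{3^n}{\alpha} \int_{T_n} |f(P)| \, dV_P$,

where $dV_P$ is the volume element on $T_n$. It also does not exceed

$\dfrac{2 \cdot 3^n}{\alpha} \int_{|f(P)| > \frac12\alpha} |f(P)| \, dV_P$.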


Proof of Theorem 3. Let $m(x)$ be the measure of the set of points where
$|f(P)| > x$, and $m^*(x)$ be the measure of the set of points where $f^*(P) > x$.
If $s(x)$ is any non-negative increasing function of $x$ then

$\int_{S_{n-1}} s(|f(P)|) \, dV_P = -\int_0^{\infty} s(x) \, dm(x)$,

$\int_{S_{n-1}} s(f^*(P)) \, dV_P = -\int_0^{\infty} s(x) \, dm^*(x)$

(7, p. 242).







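Since, from Theorem 5,

$m^*(x) \le \dfrac{2 \cdot 3^{n-1}}{x} \int_{|f(P)| > \frac12 x} |f(P)| \, dV_P = -\dfrac{2 \cdot 3^{n-1}}{x} \int_{x/2}^{\infty} y \, dm(y)$,

by formal substitution and interchange of integrations we have

$\int_0^{\infty} m^*(x)\, x^{p-1} \, dx \le -2 \cdot 3^{n-1} \int_0^{\infty} x^{p-2} \, dx \int_{x/2}^{\infty} y \, dm(y)$

$= -2 \cdot 3^{n-1} \int_0^{\infty} y \, dm(y) \int_0^{2y} x^{p-2} \, dx$

$= -\dfrac{2^p\, 3^{n-1}}{p-1} \int_0^{\infty} y^p \, dm(y)$.

But this latter

$= \dfrac{2^p\, 3^{n-1}}{p-1} \int_{S_{n-1}} |f(P)|^p \, dV_P$

and is, therefore, finite. As a consequence

$\lim_{\epsilon \to \infty} \int_{\epsilon}^{2\epsilon} m^*(x)\, x^{p-1} \, dx = 0$;

however, since $m^*(x)$ is a decreasing function

$m^*(2\epsilon)\, \epsilon^p\, \dfrac{2^p - 1}{p} = m^*(2\epsilon) \int_{\epsilon}^{2\epsilon} x^{p-1} \, dx \le \int_{\epsilon}^{2\epsilon} m^*(x)\, x^{p-1} \, dx$;

therefore, $\lim_{\epsilon \to \infty} m^*(\epsilon)\, \epsilon^p = 0$, and we can integrate by parts, getting

$\int_{S_{n-1}} (f^*(P))^p \, dV_P = -\int_0^{\infty} x^p \, dm^*(x) = p \int_0^{\infty} m^*(x)\, x^{p-1} \, dx \le \dfrac{p\, 2^p\, 3^{n-1}}{p-1} \int_{S_{n-1}} |f(P)|^p \, dV_P$,

which is (5) with $C_{n,p} = p\, 2^p\, 3^{n-1}/(p - 1)$.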

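The second statement has been proved by Hardy-Littlewood (2) for
$n = 2$. The third statement has a similar proof. This time we notice that for
the same reasons

$\int_1^{\infty} m^*(x) \, dx \le -2 \cdot 3^{n-1} \int_{1/2}^{\infty} y \, dm(y) \int_1^{2y} \dfrac{dx}{x} = -2 \cdot 3^{n-1} \int_{1/2}^{\infty} y \log 2y \, dm(y)$

$\le 2 \cdot 3^{n-1} \int_{S_{n-1}} |f(P)| \log^+ |f(P)| \, dV_P + 2 \cdot 3^{n-1} \log 2 \int_{S_{n-1}} |f(P)| \, dV_P$.

Integrating by parts and noting that

$|f| \le e + |f| \log^+ |f|$

and

$\int_0^1 m^*(x) \, dx \le m(S_{n-1})$

we have (6). These constants are not as good as those of Hardy-Littlewood
in the original case, $n = 2$.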













THEOREM 7. Let $f(P)$ belong to $L^p$, $p > 1$, on $T_n$. Then if $f^*(P)$ is defined as in
Theorem 6

$\int_{T_n} (f^*(P))^p \, dV_P \le C_{n,p} \int_{T_n} |f(P)|^p \, dV_P$

where $C_{n,p}$ depends only on $n$ and $p$. This is no longer true for $p = 1$; however, if
$|f(P)| \log^+ |f(P)|$ is integrable on $T_n$, then

$\int_{T_n} f^*(P) \, dV_P \le B_n \int_{T_n} |f(P)| \log^+ |f(P)| \, dV_P + C_n$,

where $B_n$ and $C_n$ depend only on $n$.


The proof is exactly like that of Theorem 3 with appropriate changes. 
It will now be seen immediately that Theorem 3 can be extended to an
arbitrary space with Property A, where $3^{n-1}$ is replaced by $C$.


4. Theorems which relate radial suprema to averages. 


THEOREM 8. Let $f(P)$ belong to $L$ on $S_{n-1}$. Let $u(r, P)$ be the harmonic function
in $S_n$ which takes on the values $f(P)$ on $S_{n-1}$. If

$U(P) = \sup_{0 \le r < 1} |u(r, P)|$

then $U(P) \le A_n f^*(P)$, where $f^*(P)$ is defined as in Theorem 3 and $A_n$ is a constant
depending only on $n$.
Proof. Define polar coordinates in $n$-space:

$x_1 = r \cos\theta_1$,
$x_2 = r \sin\theta_1 \cos\theta_2$,
$\ldots$
$x_{n-1} = r \sin\theta_1 \cdots \sin\theta_{n-2} \cos\theta_{n-1}$,
$x_n = r \sin\theta_1 \cdots \sin\theta_{n-1}$.

For fixed $P$ we may assume that $P$ is the point $x_1 = 1$, $x_i = 0$, $i > 1$, in which
case $\theta_1$ becomes the geodesic distance of any other point on $S_{n-1}$ from $P$.
We have then


$u(r, P) = \dfrac{1}{\omega_n} \int_0^{\pi} d\theta_1 \int_0^{\pi} d\theta_2 \cdots \int_0^{2\pi} d\theta_{n-1}\, P_n(r, \theta_1) \sin^{n-2}\theta_1\, f(Q)\, \omega$

where

$\omega_n = \dfrac{2\pi^{n/2}}{\Gamma(n/2)}$,

$P_n(r, \theta) = \dfrac{1 - r^2}{(1 - 2r\cos\theta + r^2)^{n/2}}$

is the Poisson kernel for the sphere,

$\omega = \sin^{n-3}\theta_2 \cdots \sin\theta_{n-2}$,

and $Q$ has coordinates $(\theta_1, \ldots, \theta_{n-1})$ (1, Chap. IV). Now let us observe that
in order to prove the lemma, it is sufficient to prove $|u(r, P)| \le A_n f^*(P)$
where $A_n$ is fixed and independent of $r$. We integrate by parts, then, with
respect to $\theta_1$ and obtain


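$u(r, P) = \dfrac{1}{\omega_n} P_n(r, \pi) \int_0^{\pi} d\theta_1 \int \cdots \int f(Q) \sin^{n-2}\theta_1\, \omega \, d\theta_2 \cdots d\theta_{n-1}$

$\quad - \dfrac{1}{\omega_n} \int_0^{\pi} d\theta_1\, \dfrac{\partial}{\partial\theta_1} P_n(r, \theta_1) \int_0^{\theta_1} d\theta \int \cdots \int f(Q') \sin^{n-2}\theta\, \omega \, d\theta_2 \cdots d\theta_{n-1}$

where the coordinates of $Q'$ are $(\theta, \theta_2, \ldots, \theta_{n-1})$. Recalling the definition of
$f^*(P)$, more explicitly

$f^*(P) = \underset{0 < \phi \le \pi}{\mathrm{l.u.b.}} \Big( C_n \int_0^{\phi} \sin^{n-2}\theta \, d\theta \Big)^{-1} \int_0^{\phi} d\theta \int \cdots \int |f(Q')| \sin^{n-2}\theta\, \omega \, d\theta_2 \cdots d\theta_{n-1}$

where

$C_n = \int \cdots \int \omega \, d\theta_2 \cdots d\theta_{n-1} = \omega_{n-1}$;

therefore

$|u(r, P)| \le K_n f^*(P) + f^*(P) \Big[ \dfrac{\omega_{n-1}}{\omega_n} \int_0^{\pi} \Big| \dfrac{\partial}{\partial\theta_1} P_n(r, \theta_1) \Big| \Big( \int_0^{\theta_1} \sin^{n-2}\theta \, d\theta \Big) d\theta_1 \Big]$

where $K_n$ is a constant depending only on $n$. The last expression (2, p. 107) in
brackets is $\le D_n$. This completes the proof.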


THEOREM 9. Let $f(P)$ belong to $L$ on $T_n$. Let $u(r_1, \ldots, r_n, P)$ be the function
which is harmonic in the polycylinder,

$P_n$: $|z_i| < 1$ $(i = 1, \ldots, n)$

and which assumes the values $f(P)$ on $T_n$. Then, if $f^*(P)$ is defined as in Theorem
7, we have

$\sup_A |u(r_1, \ldots, r_n, P)| \le A_K f^*(P)$

where $A$ is the region of $0 \le r_i < 1$ $(i = 1, \ldots, n)$, described by (10). $A_K$
depends only on $A$ (i.e., $K$) and $n$.


Proof. I remark that the latter restriction seems quite essential as is
evidenced not only in the proof but by the implications of the lemma (cf. the
next section). I observe again that one can assume that $P$ is the point $\theta_i = 0$
$(i = 1, \ldots, n)$ on $T_n$. Because of a few complications it will be easier to
present here the proof for $n = 2$ only. The extension to arbitrary $n$ is straightforward.











First,

$u(r_1, r_2, P) = \dfrac{1}{4\pi^2} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} P(r_1, \theta_1)\, P(r_2, \theta_2)\, f(\theta_1, \theta_2) \, d\theta_1 d\theta_2$

where $P(r, \theta) = (1 - r^2)/(1 - 2r\cos\theta + r^2)$ is the usual Poisson kernel.
Repeated integration by parts and interchange of integration gives, after
taking absolute values


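$|u(r_1, r_2, P)| \le \dfrac{1}{4\pi^2} \Big[ P(r_1, \pi) P(r_2, \pi) \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} |f(\theta_1, \theta_2)| \, d\theta_1 d\theta_2$

$\quad + P(r_1, \pi) \int_{-\pi}^{\pi} \Big| \dfrac{\partial}{\partial\theta_2} P(r_2, \theta_2) \Big| \, \Big| \int_{-\pi}^{\pi} \int_0^{\theta_2} f(\theta, \theta') \, d\theta' d\theta \Big| \, d\theta_2$

$\quad + P(r_2, \pi) \int_{-\pi}^{\pi} \Big| \dfrac{\partial}{\partial\theta_1} P(r_1, \theta_1) \Big| \, \Big| \int_0^{\theta_1} \int_{-\pi}^{\pi} f(\theta, \theta') \, d\theta' d\theta \Big| \, d\theta_1$

$\quad + \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \Big| \dfrac{\partial}{\partial\theta_1} P(r_1, \theta_1)\, \dfrac{\partial}{\partial\theta_2} P(r_2, \theta_2) \Big| \, \Big| (\mathrm{sgn}\, \theta_1 \cdot \mathrm{sgn}\, \theta_2) \int_0^{\theta_1} \int_0^{\theta_2} f(\theta, \theta') \, d\theta' d\theta \Big| \, d\theta_1 d\theta_2 \Big]$

$= \dfrac{1}{4\pi^2} (I_1 + I_2 + I_3 + I_4)$.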


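Recalling the definition of $f^*(P)$ one has $I_1 \le 4\pi^2 f^*(P)$. To get an inequality
for $I_2$ (and $I_3$) one observes that the inner double integral is less than or equal
to

$\int_{-\pi}^{\pi} \int_{-\pi}^{\pi} |f(\theta_1, \theta_2)| \, d\theta_1 d\theta_2 \le 4\pi^2 f^*(P)$.

Furthermore,

$P(r_1, \pi) \int_{-\pi}^{\pi} \Big| \dfrac{\partial}{\partial\theta_2} P(r_2, \theta_2) \Big| \, d\theta_2 = \dfrac{1 - r_1}{1 + r_1} \Big[ \int_{-\pi}^{0} \dfrac{\partial}{\partial\theta_2} P(r_2, \theta_2) \, d\theta_2 - \int_0^{\pi} \dfrac{\partial}{\partial\theta_2} P(r_2, \theta_2) \, d\theta_2 \Big]$

$= 2\, \dfrac{1 - r_1}{1 + r_1} \Big[ \dfrac{1 + r_2}{1 - r_2} - \dfrac{1 - r_2}{1 + r_2} \Big] = \dfrac{8 r_2}{(1 + r_1)(1 + r_2)} \cdot \dfrac{1 - r_1}{1 - r_2} < 4K$.

Therefore $I_2$ (and $I_3$) $\le 4\pi^2 \cdot 4K f^*(P)$. Next, in $I_4$ the inner double integral is
less than or equal to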
less than or equal to 


, by 6: 
62 (sgn9,)| 62] f f ’ |@1| < |@2|, 
sgn 6, -sgn af f < — a 
0 Jo 6; 6; 
| f f , [Ox > [62]. 
—6; —6, 





"Id 
y |e Pon.) d6; 


| 
| 


eee 


| 
| 


— 





We split up J,, 
nm ff 4 


| 6; 1<| es! | @;1>1 6, 


which with the previous remark gives us 


h< srr) | f Sy, Pr 61) Fe P(rs, 62)| « |@;|2d0,d6, 


le, 1>| 60| 





d d 
s SJ |S Peo) P(r2, 82) | |0a'doxde |, 


}@, |<] 6] 
the first integral in brackets is less than 
. | . | 7 | 
6 sin 6 
2er,(1 — 7; f | dé. f 
mri ( rn) _» | (1 — 2r; cos 0 + nyt aad - 


- a+ ny. f ___ 6 sin@ 
< Piel + 1G rif 1 — fr - i — 2r,cos0 + 7;) 

The integral on the right is less than or equal to a constant $C$ (the
reasoning is the same as in the Hardy-Littlewood reference in the proof of
Lemma 3). Therefore, this first integral in brackets (and similarly the other)
is less than or equal to $16\pi K C$. Setting

$A_K = 1 + 8K + 32KC/\pi$,

one completes the proof.





d 
6 P (re, @) | dé 








dé. 





5. Proofs of Theorems 1, 2, and 4 and related theorems.

THEOREM 10. Let $f(P)$ belong to $L^p$, $p > 1$, on $S_{n-1}$. Let $u(r, P)$ be the function
harmonic in $S_n$ with boundary values $f(P)$. If $U(P) = \sup_{0 \le r < 1} |u(r, P)|$, then

$\dfrac{1}{V_1} \int_{S_{n-1}} \{U(P)\}^p \, dV_P \le \dfrac{C_{n,p}}{V_1} \int_{S_{n-1}} |f(P)|^p \, dV_P$

where $C_{n,p}$ is a constant, depending only on $n$ and $p$. For $p = 1$ this is not true.
However, if $|f(P)| \log^+ |f(P)|$ is integrable on $S_{n-1}$, then

$\dfrac{1}{V_1} \int_{S_{n-1}} U(P) \, dV_P \le \dfrac{B_n}{V_1} \int_{S_{n-1}} |f(P)| \log^+ |f(P)| \, dV_P + \dfrac{C_n}{V_1}$

where $B_n$ and $C_n$ depend only on $n$.

Proof. The first and third statements are corollaries of Theorem 3 and
Theorem 8. The second statement has been proved in (2) for $n = 2$. As a
corollary of Theorem 10 we have


THEOREM 11. Let $u(r, P)$ be harmonic in $S_n$ and let it be such that

(11) $\dfrac{1}{V_1} \int_{S_{n-1}} |u(r, P)|^p \, dV_P \le C^p$

for all $r < 1$, and fixed $p > 1$. Then if $U(P)$ is defined as in Theorem 10

(12) $\dfrac{1}{V_1} \int_{S_{n-1}} [U(P)]^p \, dV_P \le C_{n,p}\, C^p$.

Proof. Let

$U_R(P) = \sup_{0 \le r \le R < 1} |u(r, P)|$;

then

$\dfrac{1}{V_1} \int_{S_{n-1}} [U_R(P)]^p \, dV_P \le \dfrac{C_{n,p}}{V_1} \int_{S_{n-1}} |u(R, P)|^p \, dV_P \le C_{n,p}\, C^p$

by Theorem 10. Now $U_R(P) \uparrow U(P)$ as $R \to 1$. Hence, Lebesgue's monotone
convergence theorem completes the proof.


THEOREM 12. Let $u(r, P)$ satisfy (11). Then there exists a function $u(P) \in L^p$
on $S_{n-1}$ such that

(13) $\lim_{r \to 1} \int_{S_{n-1}} |u(r, P) - u(P)|^p \, dV_P = 0$.

Proof. Hölder's inequality implies

$\dfrac{1}{V_1} \int_{S_{n-1}} |u(r, P)| \, dV_P \le \Big( \dfrac{1}{V_1} \int_{S_{n-1}} |u(r, P)|^p \, dV_P \Big)^{1/p} \le C$.

As a result

$F(r, S) = \int_S u(r, P) \, dV_P$,

where $S$ is a measurable subset of $S_{n-1}$, constitute a set of absolutely continuous
set-functions on $S_{n-1}$ which are uniformly bounded. According to Radon's
theory of integration one can form, for any $\Phi$ continuous on $S_{n-1}$, the Radon-Stieltjes
integral

$\int_{S_{n-1}} \Phi \, dF(r, S)$.

Now it is a classic theorem of Radon that from the uniformly bounded set of
set-functions $F(r, S)$ on $S_{n-1}$ one can extract a sequence $F(r_m, S)$ and find
another bounded set-function $F(S)$ such that, for any continuous $\Phi$,

(14) $\lim_{m \to \infty} \int_{S_{n-1}} \Phi \, dF(r_m, S) = \int_{S_{n-1}} \Phi \, dF(S)$

where the $r_m \to 1$, otherwise the theorem is trivial.
Now

(15) $|F(r, S)| \le \int_S |u(r, P)| \, dV_P \le \int_S U(P) \, dV_P$

so that the $F(r, S)$ are uniformly absolutely continuous, since $U(P)$ by (12)
and Hölder's inequality belongs to $L$. Therefore, by choosing $\Phi$ in (14) to be
the characteristic function (rounded-off) of $S$ one sees that $F(S)$ is also
absolutely continuous, and, therefore, the integral of a point function $u(P)$
of class $L$. Accordingly, if one picks $\Phi$ in (14) to be the Poisson kernel one finds

(16) $u(r, P) = \int_{S_{n-1}} P_n(r, Q)\, u(Q) \, dV_Q$.

One also sees from (15), by applying Lebesgue's differentiation theorem,
that $|u(P)| \le U(P)$ almost everywhere so that $u(P)$ is in $L^p$ by (12).

The reasoning in (7, p. 85) using (16) completes the proof of (13). 

To prove (2), I observe first that a subharmonic function $w(r, P)$ satisfying
(11) also satisfies the analogue of (12). The proof of this is reduced to (12)
by the device of the harmonic majorant of $w(r, P)$ and is to be found in (2,
p. 113, footnote 1) which carries over word for word to several variables.

Next, following Hardy and Littlewood, I set $w(r, P) = |f|^{\lambda/2}$ and observe that
(1) implies that $w(r, P)$ satisfies (11) with $p = 2 > 1$. That $|f|^{\lambda/2}$ is subharmonic
is a simple consequence of the mean-value theorem for the function $f^{\lambda/2}$ which
is analytic in the neighborhood of any point where $f \ne 0$. Then (2) follows
immediately.

The proof of (9) follows from Theorems 6 and 9 through the intermediary
of theorems analogous to 10 and 11 exactly as in the proof of (2).

Finally (3), Theorem 2, follows from Theorem 12 when $\lambda > 1$ since
$|\Re f| \le |f|$ and $|\Im f| \le |f|$ and the convergence of $f$ follows from that of $\Re f$
and $\Im f$ by Minkowski's inequality. When $\lambda = 1$ the Hardy-Littlewood inequality
is still valid for $f$, unlike the harmonic function; therefore the reasoning
of Theorem 12 may be repeated to account for this case.

Obviously, a similar theorem may be deduced for the polycylinder (cf.
also 10).


REFERENCES 


1. R. Courant and D. Hilbert, Methoden der Mathematischen Physik, vol. II (Berlin, 1937).

2. G. H. Hardy and J. E. Littlewood, A maximal theorem with function-theoretic applications,
Acta Math., 54 (1930), 81-116.

3. H. E. Rauch, Généralisation d'une proposition de Hardy et Littlewood et de théorèmes
ergodiques qui s'y rattachent, Comptes Rendus, 227 (1948), 887-889.

4. H. E. Rauch, A Poisson formula and the Hardy and Littlewood theorem for matrix spaces, Bull.
Amer. Math. Soc., 55 (1949), Abstract 250.

5. K. T. Smith, A generalization of an inequality of Hardy and Littlewood, Can. J. Math., 8
(1956), 157-170.

6. N. Wiener, The ergodic theorem, Duke Math. J., 5 (1939), 1-18.

7. A. Zygmund, Trigonometrical Series (Warsaw, 1935).

8. A. Zygmund, On the existence of boundary values for regular functions of several complex variables,
Bull. Amer. Math. Soc., 54 (1948), Abstract 502t.

9. A. Zygmund, A remark on functions of several complex variables, Acta Szeged, 12 (1950), 66-68.

10. A. Zygmund, On the boundary values of functions of several complex variables, Fund. Math., 36
(1949), 207-305.

University of Pennsylvania 








AN EXTENSION PROBLEM FOR FUNCTIONS WITH 
MONOTONIC DERIVATIVES 


W. B. JURKAT 


Introduction. This paper deals with questions of the following type.
Problem (A): Let $F(x)$ be the $n$th integral of a positive non-decreasing function
for all large positive $x$; the problem is to find a function $f(x)$, being the $n$th
integral of a non-decreasing function for all $x$ $(-\infty < x < \infty)$, with the
property

(A) $f(x) = \begin{cases} F(x), & \text{for all large positive } x; \\ 0, & \text{for all large negative } x. \end{cases}$

Problem (A) can be considered as a special case of the boundary value
problems, which we discuss in §2. Roughly speaking, the question is here what
values may be assumed by the $n$th integral of a monotonic function and its
first $n$ derivatives at the boundary of an interval. It is no loss of generality
to suppose the left-hand boundary values to be equal to zero as can be seen
by subtracting a suitable polynomial. Then the solution of the problem
directly depends on the solution of the reduced Hausdorff and Stieltjes moment
problems, for the latter of which we give a new approach (§1).

The method indicated leads in a simple manner to a complete solution of 
problem (A), depending on the behaviour of certain quadratic forms (§3). 
The main result of the paper consists of determining this behaviour for a 
large class of functions F(x), for which therefore problem (A) can be settled 
(Theorem 6). 


1. Some reduced moment problems. If a finite sequence $\mu_{\nu}$ $(\nu = 0, \ldots, n)$
is given, the reduced Hausdorff moment problem is to determine a non-decreasing
function $\psi(t)$, $0 \le t \le 1$, such that

(I) $\mu_{\nu} = \int_0^1 t^{\nu} \, d\psi(t)$ $(\nu = 0, \ldots, n)$.

The following result is essentially due to Achyeser-Krein (1; see also 3;
5, pp. 29-30; 6, p. 77).


THEOREM 1. A necessary and sufficient condition that the moment problem
(I) should have a solution is that, in case $n = 2m$, both quadratic forms

(1) $\sum_{i,j=0}^{m} \mu_{i+j}\, x_i x_j, \qquad \sum_{i,j=0}^{m-1} (\mu_{i+j+1} - \mu_{i+j+2})\, x_i x_j$
ty j=0 


Received March 4, 1955. This research was supported by the United States Air Force, 
through the Office of Scientific Research of the Air Research and Development Command, 
under Contract Number AF 18 (600)-691. 




should be non-negative, whereas in case $n = 2m + 1$, both quadratic forms

(2) $\sum_{i,j=0}^{m} \mu_{i+j+1}\, x_i x_j, \qquad \sum_{i,j=0}^{m} (\mu_{i+j} - \mu_{i+j+1})\, x_i x_j$

should be non-negative.

If a finite sequence $\mu_{\nu}$ $(\nu = 0, \ldots, n)$ is given, we also consider the reduced
moment problem of determining a non-decreasing function $\psi(t)$, $0 \le t \le T$,
such that

(II) $\mu_{\nu} = \int_0^T t^{\nu} \, d\psi(t)$ $(\nu = 0, \ldots, n)$

with suitable $T > 0$.
If (II) has a solution for a certain $T = T_0$, then it also has for all $T > T_0$.
Replacing $\mu_{\nu}$ by $\mu_{\nu}/T^{\nu}$ and $x_{\nu}/T^{\nu}$ by $x_{\nu}$ in Theorem 1 we obtain at once


THEOREM 2. A necessary and sufficient condition that the moment problem (II)
should have a solution is that for $n = 2m$ or $n = 2m + 1$ respectively

(3) $\sum_{i,j=0}^{m} \mu_{i+j}\, x_i x_j \ge 0, \qquad \sum_{i,j=0}^{m-1} \mu_{i+j+2}\, x_i x_j \le T \sum_{i,j=0}^{m-1} \mu_{i+j+1}\, x_i x_j$

or

(4) $\sum_{i,j=0}^{m} \mu_{i+j+1}\, x_i x_j \ge 0, \qquad \sum_{i,j=0}^{m} \mu_{i+j+1}\, x_i x_j \le T \sum_{i,j=0}^{m} \mu_{i+j}\, x_i x_j$

should hold for all values of $x_{\nu}$ and a suitable $T > 0$.


COROLLARY. A sufficient condition is that for $n = 2m$ or $n = 2m + 1$
respectively the quadratic forms

(5) $\sum_{i,j=0}^{m} \mu_{i+j}\, x_i x_j, \qquad \sum_{i,j=0}^{m-1} \mu_{i+j+1}\, x_i x_j$

or

(6) $\sum_{i,j=0}^{m} \mu_{i+j+1}\, x_i x_j, \qquad \sum_{i,j=0}^{m} \mu_{i+j}\, x_i x_j$

should be positive definite, whereas a necessary condition is that they should be
non-negative.
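The sufficient condition of the Corollary can be tested mechanically for a given finite moment sequence. The sketch below is an illustration only: it assumes exact rational moments, checks positive definiteness through leading principal minors (Sylvester's criterion), and uses helper names that are not from the paper. The first sequence consists of the moments $\mu_{\nu} = 1/(\nu + 1)$ of $d\psi = dt$ on $[0, 1]$; the second violates the Cauchy-Schwarz inequality $\mu_1^2 \le \mu_0 \mu_2$ and so fails even the necessary condition.

```python
from fractions import Fraction


def hankel(moments, size, shift):
    """Matrix with entries mu_{i+j+shift}, 0 <= i, j < size."""
    return [[Fraction(moments[i + j + shift]) for j in range(size)]
            for i in range(size)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def is_positive_definite(matrix):
    """Sylvester's criterion: all leading principal minors are positive."""
    return all(determinant([row[:k] for row in matrix[:k]]) > 0
               for k in range(1, len(matrix) + 1))


def corollary_forms(moments):
    """Matrices of the forms (5) for n = 2m, or (6) for n = 2m + 1."""
    n = len(moments) - 1
    m = n // 2
    if n % 2 == 0:
        return hankel(moments, m + 1, 0), hankel(moments, m, 1)
    return hankel(moments, m + 1, 1), hankel(moments, m + 1, 0)


lebesgue = [Fraction(1, nu + 1) for nu in range(5)]       # n = 4
bad = [Fraction(1), Fraction(1), Fraction(1, 2)]          # n = 2
print([is_positive_definite(h) for h in corollary_forms(lebesgue)])
print([is_positive_definite(h) for h in corollary_forms(bad)])
```

Leading minors decide positive definiteness only; the weaker property "non-negative" in the necessary condition would need all principal minors, which this sketch does not attempt.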


It is worthwhile to point out the close connection between problem (II)
and the reduced Stieltjes moment problem of determining a non-decreasing
function $\psi(t)$, $0 \le t < \infty$, such that

(III) $\mu_{\nu} = \int_0^{\infty} t^{\nu} \, d\psi(t)$ $(\nu = 0, \ldots, n)$.


Necessary and sufficient conditions for the moment problem (III) to have a 
solution are due to Verblunsky (7), who based his argument on certain alge- 
braic lemmas of E. Fischer. By means of Theorem 1 or 2 we are able to give a 
new and very simple approach, avoiding with Verblunsky the theory of 










continued fractions. This approach will also give us more detailed information 
about the solution. 

First we infer from Theorem 13b of Widder (8, p. 138) that a necessary and
sufficient condition that the moment problem (III) should have a solution $\psi(t)$
with infinitely many points of increase is that the quadratic forms (5) resp. (6)
should be positive definite (cf. 8, p. 6). The necessary part is trivial and the
sufficient part follows by inductive definition of $\mu_{n+1}, \mu_{n+2}, \ldots$.

Now the Corollary to Theorem 2 shows that, if there is a solution of (III)
with infinitely many points of increase, then there also is a solution of (II).
If there is a solution of (III) with a finite number of points of increase (that is,
a step-function with finitely many steps) the same conclusion is true. Conversely
every solution of (II) also gives a solution of (III), such that the
problems (II) and (III) are equivalent. The conditions of Theorem 2, now also
valid for problem (III), are slightly different from those of Verblunsky and
have the advantage of using only the known values $\mu_0, \ldots, \mu_n$.

With the help of a mean value theorem for systems of integrals (4, p. 97) 
we further see that, if there is a solution of (II), then there also is a solution 
of (II) by a non-decreasing step-function with finitely many steps. Hence, if 
(III) has a solution, then (III) even has a solution with a finite number of 
points of increase. Therefore the conditions of Theorem 2 also are necessary 
and sufficient for the moment problem (III) to have a solution y(t) with a finite 
number of points of increase. 

Similar arguments can be used for the reduced Hamburger moment problem

(IV) $\mu_{\nu} = \int_{-\infty}^{\infty} t^{\nu} \, d\psi(t)$ $(\nu = 0, \ldots, n)$.

2. Some boundary value problems for functions with monotonic
derivatives. We consider the following problem (B): Given real numbers
$c_{\nu}$ $(\nu = 0, \ldots, n)$, $X$, and $T > 0$, find a function $f(x)$, which is for $X - T \le
x \le X$ the $n$th integral of a non-decreasing function $\phi(x)$ and satisfies the
boundary value condition

(B) $f^{(\nu)}(X - T) = 0, \qquad f^{(\nu)}(X) = c_{\nu}$ $(\nu = 0, \ldots, n)$,

where $f^{(n)}(x)$ is to be identified with $\phi(x)$ by definition. (For points of continuity
$f^{(n)}(x) = \phi(x)$ holds by itself, and for other points $f^{(n)}(x)$, $n \ge 1$, is not
defined a priori.) Besides (B) we introduce the problem (B') differing from
(B) only in the possibility that $T > 0$ may be chosen suitably.

If problem (B) or (B') has a solution, then for $X - T \le x \le X$ and
$\nu = 0, \ldots, n$ we have

(7) $f^{(n-\nu)}(x) = \dfrac{1}{\nu!} \int_{X-T}^{x} (x - t)^{\nu} \, d\phi(t)$;

in particular,

(8) $c_{n-\nu} = \dfrac{1}{\nu!} \int_{X-T}^{X} (X - t)^{\nu} \, d\phi(t)$.




Conversely, if (8) holds with a non-decreasing function $\phi(x)$, where we may
assume $\phi(X - T) = 0$ without restriction, then

(9) $f(x) = \dfrac{1}{n!} \int_{X-T}^{x} (x - t)^{n} \, d\phi(t)$

is a solution of (B) or (B').
By a change of variables the condition (8) can be written in the form

(10) $\nu!\, c_{n-\nu} = \int_0^T t^{\nu} \, d\psi(t)$ $(\nu = 0, \ldots, n)$,

or

(11) $\nu!\, c_{n-\nu}/T^{\nu} = \int_0^1 t^{\nu} \, d\psi(t)$ $(\nu = 0, \ldots, n)$,

with non-decreasing functions $\psi(t)$.
Thus we have proved the following results:

THEOREM 3. Problem (B) has a solution if and only if problem (I) has a
solution with

$\mu_{\nu} = \nu!\, c_{n-\nu}/T^{\nu}$ $(\nu = 0, \ldots, n)$.

THEOREM 4. Problem (B') has a solution if and only if problem (II) has a
solution with

$\mu_{\nu} = \nu!\, c_{n-\nu}$ $(\nu = 0, \ldots, n)$.


Explicit conditions can be taken from Theorems 1 and 2. 


3. Extension theorems for functions with monotonic derivatives. 


We now consider problem (A) at the beginning of this paper. A simple argu- 
ment shows: 


THEOREM 5. Problem (A) has a solution if and only if problem (B') has a
solution with

$c_{\nu} = F^{(\nu)}(X)$ $(\nu = 0, \ldots, n)$

for all large positive $X$ ($X$ only denoting numbers where $F^{(n)}(X)$ exists).


Explicit conditions can be taken from Theorem 2 by means of Theorem 4. 
We shall only use the special conditions of the Corollary of Theorem 2: 


COROLLARY. A sufficient (necessary) condition that problem (A) should have
a solution is that for $n = 2m$ or $n = 2m + 1$ respectively the quadratic forms

(12) $\sum_{i,j=0}^{m} (i + j)!\, F^{(n-i-j)}(X)\, x_i x_j, \qquad \sum_{i,j=0}^{m-1} (i + j + 1)!\, F^{(n-i-j-1)}(X)\, x_i x_j$

or

(13) $\sum_{i,j=0}^{m} (i + j + 1)!\, F^{(n-i-j-1)}(X)\, x_i x_j, \qquad \sum_{i,j=0}^{m} (i + j)!\, F^{(n-i-j)}(X)\, x_i x_j$

should be positive definite (non-negative) for all large positive $X$ ($X$ only denoting
numbers where $F^{(n)}(X)$ exists).


In general it is rather difficult to decide whether the forms (12), (13) are 
positive definite or not. But there is a certain class of functions F(x) for 
which we can give a complete solution. These are the L-functions or logar- 
ithmico-exponential functions in the sense of Hardy (2, p. 17). 


THEOREM 6. Necessary and sufficient that the problem (A) should have a
solution for an L-function $F(x)$ and $n \ge 2$ is that $F(x)/x^n$ should be non-decreasing
for all large positive $x$. For $n = 0, 1$ there is always a solution.


Proof. The existence of a solution for $n = 0, 1$ follows at once from the
Corollary and the fact that $F^{(n)}(X)$, $F^{(n-1)}(X)$ are positive for $X \to +\infty$.
From now on we may assume $n \ge 2$. Using the elementary properties of
L-functions and our supposition on $F(x)$,

(14) $F^{(n+1)}(x) \ge 0, \qquad F^{(n)}(x) \ge \epsilon > 0, \qquad x \to +\infty$,

we only have to discuss the following cases:

(a) $x F^{(n+1)}(x) \ge \delta > 0, \qquad x \to +\infty$

or

(b) $F(x) = x^n L(x), \qquad 0 < \delta \le L(x) = o(\log x), \qquad x \to +\infty$,

with the possibilities, for $x \to +\infty$,

(b$_1$) $L'(x) = 0$, (b$_2$) $L'(x) > 0$, (b$_3$) $L'(x) < 0$.

We shall show that (A) has a solution in cases (a), (b$_1$), (b$_2$), where $F(x)/x^n$
is non-decreasing for $x \to +\infty$, and that (A) has no solution in case (b$_3$),
where $F(x)/x^n$ is strictly decreasing for $x \to +\infty$. All this together will prove
the Theorem.


Example. If $F(x) = x^n + x^{n-1}$, problem (A) has a solution for $n = 1$,
but no solution for $n \ge 2$.
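For $n = 2$ the Example can be verified directly from the necessary condition of the Corollary. The sketch below is illustrative and rests on stated assumptions: it builds only the first form of (12), whose matrix for $n = 2$ has entries $(i + j)!\, F^{(2-i-j)}(X)$, and the derivative lists are typed in by hand for the two functions compared. The function names are not from the paper.

```python
from fractions import Fraction
from math import factorial


def form_matrix(derivs, size, shift):
    """Entries (i+j+shift)! * F^(n-i-j-shift)(X), where derivs[k] = F^(k)(X)."""
    n = len(derivs) - 1
    return [[factorial(i + j + shift) * derivs[n - i - j - shift]
             for j in range(size)] for i in range(size)]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def derivs_example(x):
    """F, F', F'' at x for F(x) = x^2 + x (the Example with n = 2)."""
    x = Fraction(x)
    return [x * x + x, 2 * x + 1, Fraction(2)]


def derivs_power(x):
    """F, F', F'' at x for F(x) = x^2 (case (b_1) with c = 1)."""
    x = Fraction(x)
    return [x * x, 2 * x, Fraction(2)]


for x in (1, 10, 1000):
    print(x,
          det2(form_matrix(derivs_example(x), 2, 0)),
          det2(form_matrix(derivs_power(x), 2, 0)))
```

For $F(x) = x^2 + x$ the determinant is $2(2X^2 + 2X) - (2X + 1)^2 = -1$ for every $X$, so the form is indefinite and the necessary condition fails, in agreement with the Example. For $F(x) = x^2$ the determinant vanishes identically, which is the borderline case where the form is non-negative but not positive definite.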

In case (a) we use the sufficient part of the Corollary and restrict ourselves 
to the form 


(15) > oem Fo i 1X) xay, 

i, j=0 
where $x_{\nu} X^{\nu}$ is replaced by $x_{\nu}$. We assume that (15) has its minimum value
$M(X)$ on $\sum x_i^2 = 1$ for $x_i = x_i(X)$. It is enough to show that $M(X) > 0$
holds for $X \to +\infty$, or for any sequence $X = X_k \to \infty$ such that $x_i(X_k) \to \xi_i$
for $i = 0, \ldots, m$, with $\sum \xi_i^2 = 1$.

For $\nu = 0, \ldots, n$ and $X > x_0$ ($x_0$ large enough) we have
1 ¢* (Sot 
(16) p?-“(x) - tf (X aa x)’ F®* (x) dx + > yt po +O, ) 
° Zo t=0 




and therefore 


M(X) = i (¢ aoe) z= 2)" 5.) Foe) de 


i—0 


+ res $a.) + O(X™"). 


Hence, taking X = X, and using (a), 


M(X) > ef (z (¥ a= (x — 3)" x.) dx + O(X™) 


t< a aie x ey 
X So it+g+1 X™ 





>> + O(x™") 


“ TF 
>sd = +jt+ it o@) 


1 
> sf (Sex) dx + 0(1). 
0 t=O 


Sf (SaxVex >, De =1 


i=0 


Since 


we obtain M(X,) > 0 for X, ~ + @. 


The same method can be used to show that the other forms in the Corollary 
are positive definite also. 


In case (b$_1$), $F(x) = c x^n$ holds for $x \to +\infty$ with positive $c$. Then a solution
of the problem (A) is given by the function

$f(x) = \begin{cases} c x^n, & x \ge 0, \\ 0, & x < 0. \end{cases}$
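In case (b$_2$), instead of (16) we use the Leibniz formula

(17) $\dfrac{\nu!}{n!\, X^{\nu}} F^{(n-\nu)}(X) = L(X) + \sum_{\ell=1}^{n-\nu} \dbinom{n-\nu}{\ell} \dfrac{\nu!}{(\nu + \ell)!}\, X^{\ell} L^{(\ell)}(X)$

for $\nu = 0, \ldots, n$ and $X \to +\infty$, and the asymptotic formulae

(18) $X L'(X) = o(L(X))$

and

(19) $X^{\ell} L^{(\ell)}(X) = (-1)^{\ell-1} (\ell - 1)!\, X L'(X) + o(X L'(X))$

for $\ell = 1, 2, \ldots$ and $X \to +\infty$, which are consequences of (b) and Hardy
(2, p. 37, line 5). From both together it follows for $\nu = 0, \ldots, n$ and $X \to +\infty$
that

(20) $\dfrac{\nu!}{n!\, X^{\nu}} F^{(n-\nu)}(X) = L(X) + \gamma_{\nu}\, X L'(X) + o(X L'(X))$,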



















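where

(21) $\gamma_{\nu} = \sum_{\ell=1}^{n-\nu} \dbinom{n-\nu}{\ell} (-1)^{\ell-1} \dfrac{\nu!\, (\ell - 1)!}{(\nu + \ell)!} = \sum_{\ell=1}^{n-\nu} \dbinom{n-\nu}{\ell} (-1)^{\ell-1} \int_0^1 x^{\nu} (1 - x)^{\ell-1} \, dx$

$\quad = \int_0^1 \dfrac{x^{\nu} - x^{n}}{1 - x} \, dx, \qquad \nu = 0, \ldots, n$.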
Hence 


M(X) = La0( ¥ =.) + o(XL'(X)) 


msn fi(d-a-2)'-0-of BE 


Now let 


X=X,, > & ¥ 0. 


t=O 


Then 
M(x) = LX)(¥ &) + LEX), 


and on account of (b) we get M(X,) > 0 for X,-—+ + o. In the remaining 
case 


DD & =0 (m > 1) 


t=—0 


we have for X = X, 


(23) M(X) > XL'(X) f (Se. a@- »)') a + 0(XL'(X)). 


Since 
1 m 2 m m 
f (Se a - x)') => 0, 2 & = 0, Lt = 1, 


we find $M(X_k) > 0$ for $X_k \to +\infty$. The same method can be used to show
that the other forms in the Corollary are positive definite also.

In case (b$_3$) we use the necessary part of the Corollary and show that the
minimum $M(X)$ of the form (15) on $\sum x_i^2 = 1$ is negative for $X \to +\infty$.
Taking $\xi_i$ with


> & = 0, 2 =1 (m > 1), 


we obtain similarly to (22) and (23) 





— 




M(X) < em F°--9(X) £8, 


< xi) f' (Sea-)') #4 ox) 
< 0, X—++ @, 


This completes the proof of Theorem 6. 


REFERENCES 


1. N. Achyeser and M. Krein, Über eine Transformation der reellen Toeplitzschen Formen und
das Momentenproblem in einem endlichen Intervalle, Commun. Soc. Math. Kharkoff (4),
11 (1935), 21-26.

2. G. H. Hardy, Orders of infinity (Cambridge Tract, 1924).

3. L. Kantorovič, On the moment problem for a finite interval, C. R. (Doklady), Acad. Sci.
URSS (N.S.), 14 (1937), 531-537.

4. G. Kowalewski, Integralgleichungen (Berlin, 1930).

5. M. Krein, The ideas of P. L. Čebyšev and A. A. Markov in the theory of limiting values of
integrals and their further development, Uspekhi Matem. Nauk (N.S.) 6 (1951), 3-120.

6. J. A. Shohat and J. D. Tamarkin, The problems of moments (New York, 1943).

7. S. Verblunsky, On a problem of moments, Proc. Cambridge Philos. Soc. 45 (1949), 1-4.

8. D. V. Widder, The Laplace transform (Princeton, 1946).


University of Cincinnati











ON RELATIONSHIPS AMONGST CERTAIN SPACES OF 
SEQUENCES IN AN ARBITRARY BANACH SPACE 


C. W. McARTHUR 


1. Introduction. Let $X$ be a Banach space (B-space). A sequence $\{s(i)\}$
in $X$ is unconditionally summable if and only if every rearrangement of the
series $\sum_i s(i)$ is convergent. The set of unconditionally summable sequences
in $X$ will be written as $U(X)$. In this paper several classes of summable sequences
in $X$ will be compared with one another. Each class to be considered
is identical with $U(X)$ when $X$ has finite dimension.

The following notation will be used. The set of natural numbers will be
denoted by $N$ and the collection of non-null finite subsets of $N$ by $\mathcal{F}$. A sequence
in $X$ will usually be denoted by the single letter $s$ and its value at
$i \in N$ by $s(i)$. If $s$ is a sequence in $X$ and $F \in \mathcal{F}$ the sum of the terms $s(i)$ such
that $i \in F$ will be written $\sum_F s(i)$.

A sequence $s$ in $X$ will be called weakly unconditionally summable if and only
if $\sum_i |f(s(i))| < \infty$ for every $f \in X^*$, the adjoint space of $X$. Let $B(X)$ stand
for the set of weakly unconditionally summable sequences in $X$. Gelfand (4)
has shown that $s \in B(X)$ if and only if $\sup[\|\sum_F s(i)\| : F \in \mathcal{F}] < \infty$. With
the usual definitions for addition of sequences and multiplication of a
sequence by a scalar $B(X)$ is a vector space. It is known that $B(X)$ is a B-space
with the norm of each $s \in B(X)$ defined by $\|s\| = \sup[\|\sum_F s(i)\| : F \in \mathcal{F}]$.
This will be the norm intended when $B(X)$ is referred to as a B-space in the
sequel. As a consequence of a result of Birkhoff (2), $U(X)$ is a closed linear
subspace of $B(X)$.
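For a finite sequence the norm $\|s\| = \sup[\|\sum_F s(i)\| : F \in \mathcal{F}]$ can be computed by brute force over all non-empty index sets. The sketch below is purely illustrative and makes its own assumptions: vectors are tuples of real numbers, the underlying norm is the supremum norm (a finite-dimensional stand-in for $c_0$), and the sample sequence of unit vectors is chosen here, not taken from the paper.

```python
from itertools import combinations


def sup_norm(vector):
    """Supremum norm of a finite tuple of real numbers."""
    return max(abs(x) for x in vector)


def finite_subset_norm(sequence):
    """Maximum over non-empty index sets F of the sup norm of sum_{i in F} s(i)."""
    dimension = len(sequence[0])
    best = 0
    for size in range(1, len(sequence) + 1):
        for subset in combinations(range(len(sequence)), size):
            total = [sum(sequence[i][k] for i in subset) for k in range(dimension)]
            best = max(best, sup_norm(total))
    return best


def unit_vectors(count):
    """The first `count` unit vectors e_1, ..., e_count as tuples."""
    return [tuple(1 if k == i else 0 for k in range(count)) for i in range(count)]


print(finite_subset_norm(unit_vectors(6)))
print(finite_subset_norm([(1, 0), (1, 0), (-1, 0)]))
```

With the supremum norm every finite sum of distinct unit vectors has norm 1, so the value for the unit vectors stays bounded however long the sequence is made, while the partial sums themselves do not form a Cauchy sequence in that norm.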

Following Hadwiger (5), a sequence $s$ in a B-space $X$ has an invariant sum
if and only if there is an $x \in X$ such that $x = \sum_i s(i)$ and such that $x$ is the
sum of each of the convergent rearrangements of $\sum_i s(i)$. Let $IS(X)$ stand
for the class of sequences in $X$ with an invariant sum. It is known that if $X$
has finite dimension then $U(X) = IS(X)$. Hadwiger (5) has shown that if $X$
is a Hilbert space with infinite dimension then $U(X)$ is a proper subset of
$IS(X)$. In this paper Hadwiger's result is sharpened and extended to any
B-space with infinite dimension.

If s is a sequence in X and there is $x \in X$ such that $x = \sum_i s(i)$ then x
will be called the sum of s. In case there is $x \in X$ such that $f(x) = \sum_i f(s(i))$
for all $f \in X^*$ then x will be called the weak sum of s. It follows easily that a
sequence s in a B-space X can have at most one weak sum. It can be shown that
in any B-space X there are sequences which have a sum but are not elements
of B(X). Conversely, in some B-spaces, for example, in $X = c_0$, the B-space
of real sequences which converge to 0 with $\|s\| = \sup[|s(i)| : i \in N]$ for each
$s \in c_0$, there exist sequences which are elements of B(X) but which do not
have sums.
Two new closed linear subspaces of B(X) are introduced in this paper.
They are
$$B_w(X) = [s \in B(X) : s \text{ has a weak sum}], \qquad B_s(X) = [s \in B(X) : s \text{ has a sum}].$$
For any B-space it is true that
$$U(X) \subset B_s(X) = IS(X) \cap B(X) \subset B_w(X) \subset B(X).$$

We show that if $X = c_0$ then all of these containments are proper.


2. Closed linear subspaces of B(X). Dunford (3) and Gelfand (4) have
shown that a sequence s in a B-space X is weakly unconditionally summable
if and only if there is a real number M such that $\sum_i |f(s(i))| \le M\|f\|$ for all
$f \in X^*$. A norm for the vector space of weakly unconditionally summable
sequences in X is defined by setting

$$\|s\|' = \sup\Big[\sum_i |f(s(i))| : f \in X^* \text{ and } \|f\| \le 1\Big]$$

for each sequence s of this class. Let B'(X) denote the normed vector space of
weakly unconditionally summable sequences in X with the norm of the
preceding sentence. As a special case of a result of Dunford (3, Theorem 30)
we have that B'(X) is a B-space.

The following lemma is essentially given by Pettis (6, Theorem 3.2.2.). 


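LEMMA 2.1. If s is weakly unconditionally summable then
$$\sup\Big[\Big\|\sum_F s(i)\Big\| : F \in \mathcal{F}\Big] \le \sup\Big[\sum_i |f(s(i))| : f \in X^* \text{ and } \|f\| \le 1\Big] \le 2 \sup\Big[\Big\|\sum_F s(i)\Big\| : F \in \mathcal{F}\Big].$$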
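
The two inequalities of Lemma 2.1 can be checked numerically on a finite sequence. The sketch below is only an illustration under stated assumptions: X is taken to be the plane with the Euclidean norm, the five vectors are hypothetical sample data, and the middle quantity is computed through the identity $\sup_{\|f\| \le 1} \sum_i |f(s(i))| = \max_{\epsilon_i = \pm 1} \|\sum_i \epsilon_i s(i)\|$, which holds in real normed spaces and is not taken from the text.

```python
from itertools import combinations, product
import math


def norm(v):
    """Euclidean norm (an assumption: any norm on a real space would do)."""
    return math.sqrt(sum(c * c for c in v))


def vec_sum(vectors, dim):
    return [sum(v[j] for v in vectors) for j in range(dim)]


def finite_subset_sup(s):
    """sup of ||sum_{i in F} s(i)|| over the non-empty subsets F of the index set."""
    dim = len(s[0])
    return max(norm(vec_sum(F, dim))
               for r in range(1, len(s) + 1)
               for F in combinations(s, r))


def functional_sup(s):
    """sup of sum_i |f(s(i))| over ||f|| <= 1, via the sign-pattern identity."""
    dim = len(s[0])
    best = 0.0
    for signs in product((1.0, -1.0), repeat=len(s)):
        signed = [[e * c for c in v] for e, v in zip(signs, s)]
        best = max(best, norm(vec_sum(signed, dim)))
    return best


# Hypothetical sample sequence (finitely many non-zero terms).
s = [(1.0, 0.0), (-0.5, 0.5), (0.25, -1.0), (0.0, 0.75), (-1.0, -0.25)]
lhs = finite_subset_sup(s)
mid = functional_sup(s)
print(lhs, mid, 2 * lhs)
```

The printed triple should be non-decreasing from left to right, in agreement with the lemma.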
LEMMA 2.2. The normed vector space B(X) is complete. 


Proof. Since B(X) and B’(X) differ only in their norms and B’(X) is 
complete it is evident from the relationships between their norms given in 
Lemma 2.1 that B(X) is complete. 


THEOREM 2.3. For any B-space X the spaces $B_w(X)$ and $B_s(X)$ are closed
linear subspaces of B(X), and the operation L defined on $B_w(X)$ to X by setting
L(s) equal to the weak sum of s for each $s \in B_w(X)$ is linear and has norm 1.

Proof. To show that $B_w(X)$ is closed in B(X) suppose $\{s_n\}$ is a sequence
in $B_w(X)$ which converges to $s \in B(X)$. For each $n \in N$ let $x_n$ denote the
weak sum of $s_n$. Since $\{s_n\}$ is a Cauchy sequence in B(X) there is for each
$\epsilon > 0$ a natural number $n_\epsilon$ such that $\|s_n - s_m\| < \epsilon/2$ if $n, m > n_\epsilon$. For
$n, m > n_\epsilon$ and $f \in X^*$ with $\|f\| \le 1$ one has

$$|f(x_m - x_n)| \le \sum_i |f(s_n(i) - s_m(i))| \le 2\|s_n - s_m\| < \epsilon,$$

the second inequality given by Lemma 2.1. It follows that $\{x_n\}$ is a Cauchy
sequence and therefore has a limit x. Again, suppose $\epsilon > 0$ is given and $f \in X^*$
with f non-zero. There is an $n_\epsilon$ such that

$$\|s_n - s\| < \epsilon/(4\|f\|), \qquad n > n_\epsilon,$$

and since $x_n$ converges to x, $n_\epsilon$ may be chosen large enough so

$$\|x_n - x\| < \epsilon/(2\|f\|), \qquad n > n_\epsilon.$$

Hence, if $n > n_\epsilon$ then

$$\Big|f(x) - \sum_i f(s(i))\Big| \le |f(x) - f(x_n)| + \sum_i |f(s_n(i) - s(i))| \le \|f\|\,(\epsilon/(2\|f\|)) + 2\|f\|\,\|s_n - s\| < \epsilon,$$

using Lemma 2.1 to get the second inequality. This proves that x is the weak
sum of s.

To show that $B_s(X)$ is closed in B(X) suppose $\{s_n\}$ is a sequence in $B_s(X)$
which converges to $s \in B(X)$. For each $n \in N$ let $x_n$ denote the sum of $s_n$.
Since $B_s(X) \subset B_w(X)$ and $B_w(X)$ is closed, s has a weak sum x. Also $\{x_n\}$
converges to x. Since $\{x_n\}$ converges to x and $\{s_n\}$ converges to s, if $\epsilon > 0$ is
given there is $p \in N$, dependent on $\epsilon$, such that $\|x - x_p\| < \epsilon/3$ and $\|s_p - s\| < \epsilon/3$.
Also since $x_p = \sum_i s_p(i)$, there is a $q \in N$ such that if $r > q$ then

$$\Big\|x_p - \sum_{i=1}^{r} s_p(i)\Big\| < \epsilon/3.$$

Hence if $r > q$, then

$$\Big\|x - \sum_{i=1}^{r} s(i)\Big\| \le \|x - x_p\| + \Big\|x_p - \sum_{i=1}^{r} s_p(i)\Big\| + \Big\|\sum_{i=1}^{r} s_p(i) - \sum_{i=1}^{r} s(i)\Big\| < \epsilon.$$

This shows that x is the sum of s.
It remains to show that L is a linear operation with norm 1. Let

$$E = [f : f \in X^* \text{ and } \|f\| = 1].$$

Fix $s \in B_w(X)$ and let x = L(s). Then

$$\|x\| = \sup[|f(x)| : f \in E] = \sup\Big[\lim_n \Big|\sum_{i=1}^{n} f(s(i))\Big| : f \in E\Big] \le \sup\Big[\sup\Big[\Big|f\Big(\sum_{i=1}^{n} s(i)\Big)\Big| : f \in E\Big] : n \in N\Big] = \sup\Big[\Big\|\sum_{i=1}^{n} s(i)\Big\| : n \in N\Big] \le \|s\|.$$

Hence L, which is obviously additive, is continuous and $\|L\| \le 1$. Since for
any $x_0 \in X$ the sequence $\{x_0, \theta, \theta, \ldots, \theta, \ldots\}$ is in $B_w(X)$ and has $\|x_0\|$ for its
norm, clearly $\|L\| = 1$.






















3. Extension of a theorem of Hadwiger to B-spaces. The following 
theorem is obtained by applying a modification of Hadwiger’s argument (5) 
to the general case. 


THEOREM 3.1. If X is a B-space the following are equivalent:
(i) X has infinite dimension.

(ii) the difference $IS(X) \setminus B(X)$ is non-void.

(iii) U(X) is a proper subset of IS(X).


Proof. Because of the well-known fact that $U(X) \subset IS(X) \cap B(X)$ for
all X, it is evident that (ii) implies (iii). Since U(X) = IS(X) if X has finite
dimension, (iii) implies (i). It will now be shown that (i) implies (ii). By a
remark of Banach's (1, p. 238), X contains a closed infinite dimensional
linear subspace $X_0$ which has a basis $\{x(i)\}$ with $\|x(i)\| = 1$, $i \in N$. Using a
result of Banach (1, pp. 110-111), there is a sequence $\{f_i\}$ in $X^*$ such that
$f_i(x(j)) = \delta_{ij}$ and for each $x \in X_0$, $x = \sum_i f_i(x)\,x(i)$.

Consider the sequence of finite blocks

$$B_k = \{x(k)/k,\ -x(k)/k,\ \ldots,\ x(k)/k,\ -x(k)/k\}, \qquad k = 1, 2, 3, \ldots$$

where $B_k$ consists of $2k^2$ terms each of which is either x(k)/k or -x(k)/k
according as it is in an odd or an even place in $B_k$. Note that x(k)/k occurs
$k^2$ times in each $B_k$ so the sum of the odd place terms in $B_k$ has norm k.
Construct a sequence s in X by adjoining the second block of terms to the first,
the third block to this, etc. Since the norm of the sum of the odd place terms in
each block is k, $s \notin B(X)$. Clearly $\sum_i s(i) = \theta$. It remains to show that s has
an invariant sum. Suppose that s' is a rearrangement of s and that $y = \sum_i s'(i)$.
Since $X_0$ is closed, $y \in X_0$. Express y by its biorthogonal development
$y = \sum_i f_i(y)\,x(i)$. For arbitrary $i \in N$, we have $f_i(y) = \sum_j f_i(s'(j))$. Take
$n_0$ large enough so that all terms in the block $B_i$ occur in the sum

$$s'(1) + s'(2) + \ldots + s'(n_0).$$

If $n > n_0$ then

$$\sum_{j=1}^{n} f_i(s'(j)) = f_i\Big(\sum_F s'(j)\Big) + \sum_{F'} f_i(s'(j)),$$

where $F = [j : j \le n$ and $s'(j)$ is a term of $B_i]$ and
$F' = [j : j \le n$ and $j \notin F]$.
Now $\sum_F s'(j) = \theta$, and by biorthogonality $f_i(s'(j)) = 0$ if $j \in F'$, so $f_i(y) = 0$.
Since $f_i(y) = 0$ for all i it follows that $y = \theta$.
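
The two properties of the blocks used in the proof of Theorem 3.1 (partial sums inside $B_k$ never exceed 1/k in norm, while the odd place terms of $B_k$ add up to a vector of norm k) can be seen in a finite truncation. The sketch below is an illustration only: it assumes the basis vectors x(k) are the standard unit vectors of a K-dimensional Euclidean space, which stands in for the infinite dimensional subspace $X_0$ of the proof.

```python
import math


def block(k, dim):
    """Block B_k: 2*k*k terms alternating +e_k/k, -e_k/k (e_k = k-th unit vector)."""
    plus = [0.0] * dim
    plus[k - 1] = 1.0 / k
    minus = [-c for c in plus]
    return [plus if j % 2 == 0 else minus for j in range(2 * k * k)]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


K = 6  # number of blocks kept in this truncation (hypothetical choice)
largest_partial = []   # largest norm of a partial sum inside each block
odd_place_norm = []    # norm of the sum of the odd place terms of each block

for k in range(1, K + 1):
    terms = block(k, K)
    running = [0.0] * K
    peak = 0.0
    for t in terms:
        running = [a + b for a, b in zip(running, t)]
        peak = max(peak, norm(running))
    largest_partial.append(peak)
    odd = terms[0::2]  # 1st, 3rd, 5th, ... terms of the block
    odd_place_norm.append(norm([sum(t[j] for t in odd) for j in range(K)]))

print(largest_partial)
print(odd_place_norm)
```

Since every block sums to the zero vector, the partial sums of the concatenated sequence coincide with the within-block partial sums, so the first list shows why the series converges to $\theta$ and the second why the sequence is not in B(X).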


4. Comparison of subspaces of B(X). For any B-space X, $U(X) \subset B(X)$
so clearly $U(X) \subset B_s(X)$. Also $B_s(X) \subset IS(X)$ for any B-space X, because if
$s \in B_s(X)$ and s has the sum x and if s' is a rearrangement of s with sum x'
it follows that f(x) = f(x') for all $f \in X^*$ so x = x'. With these observations
the following lemma is obvious.










LEMMA 4.1. For any B-space X, $U(X) \subset B_s(X) = IS(X) \cap B(X) \subset B_w(X) \subset B(X)$.

A B-space X is weakly complete if and only if every weakly convergent 
sequence in X is weakly convergent to an element of X. 


THEOREM 4.2. If X is weakly complete then
$$U(X) = B_s(X) = IS(X) \cap B(X) = B_w(X) = B(X) \subset IS(X).$$
The containment is proper if and only if X has infinite dimension.


Proof. For any B-space, $U(X) \subset IS(X)$ and it is well known that when X
is weakly complete U(X) = B(X). Hence $B(X) \subset IS(X)$ when X is
weakly complete. The theorem then follows by Lemma 4.1 and Theorem 3.1.


LEMMA 4.3. If for a B-space X, U(X) is a proper subspace of B(X), then
U(X) is a proper subspace$^1$ of $B_s(X)$.


Proof. Suppose $s \in B(X) \setminus U(X)$. For each $k \in N$ let $B_k$ denote a block
of 2k terms as follows:

$$B_k = \{s(k)/k,\ -s(k)/k,\ \ldots,\ s(k)/k,\ -s(k)/k\},$$

that is, the odd place terms in $B_k$ are s(k)/k and the even place terms are
-s(k)/k. We construct $s' \in B_s(X) \setminus U(X)$ by adjoining the terms of the
block $B_2$ to those of $B_1$ and then adjoining the terms of $B_3$ to these, etc.
Clearly $\theta = \sum_i s'(i)$ and for each $f \in X^*$,

$$\sum_i |f(s'(i))| = 2\sum_i |f(s(i))| < \infty,$$

so $s' \in B_s(X)$. Finally, since $s \notin U(X)$ it follows that the series $\sum_i s'(i)$ has a
subseries, namely, $\sum_i s'(2i - 1)$, which does not converge unconditionally.
Hence $s' \notin U(X)$.


COROLLARY 4.4. The B-space $U(c_0)$ is a proper subspace of $B_s(c_0)$.

Proof. Consider the sequence $\{s_n\}$ in $c_0$ where for each n, $s_n(i) = 1$ if i = n
and $s_n(i) = 0$ if $i \ne n$. The sequence $\{s_n\}$ is an element of $B(c_0)$ but it does
not have a sum so is not an element of $U(c_0)$. The corollary follows by
Lemma 4.3.
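
The behaviour of the unit vector sequence used in the proof of Corollary 4.4 can be illustrated on a finite truncation. The sketch below assumes that $c_0$ is replaced by sequences of length 8 with the sup norm (a hypothetical cut-off chosen for the example): every finite subset sum has norm 1, which reflects membership in $B(c_0)$, while any two distinct partial sums are at distance 1, so the partial sums are not a Cauchy sequence.

```python
from itertools import combinations


def unit(n, dim):
    """n-th unit vector: value 1 at index n, 0 elsewhere."""
    return [1.0 if i == n else 0.0 for i in range(dim)]


def sup_norm(v):
    return max(abs(c) for c in v)


dim = 8  # truncation length (an assumption of this sketch)
s = [unit(n, dim) for n in range(dim)]

# Norms of all finite subset sums: the set collapses to the single value 1.
subset_norms = {
    sup_norm([sum(v[i] for v in F) for i in range(dim)])
    for r in range(1, dim + 1)
    for F in combinations(s, r)
}

# Distances between distinct partial sums s_1 + ... + s_m.
partial, running = [], [0.0] * dim
for v in s:
    running = [a + b for a, b in zip(running, v)]
    partial.append(list(running))
gaps = [sup_norm([a - b for a, b in zip(partial[m], partial[n])])
        for n in range(dim) for m in range(n + 1, dim)]

print(subset_norms, min(gaps))
```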


LEMMA 4.5. If for a B-space X, U(X) is a proper subspace of $B_s(X)$ then
$B_s(X)$ is a proper subspace of $B_w(X)$.


Proof. If $s \in B_s(X) \setminus U(X)$ then there is a permutation t of N such that
the sequence $\{s(t(i))\}$ does not have a sum. Let x denote the sum of s. Then
x is the weak sum of s and since $s \in B(X)$ it follows that x is the weak sum of
$\{s(t(i))\}$. Thus $\{s(t(i))\}$ belongs to $B_w(X)$ but not to $B_s(X)$.


By Corollary 4.4 and Lemma 4.5 we have the next corollary.


COROLLARY 4.6. The space $B_s(c_0)$ is a proper subspace of $B_w(c_0)$.




LEMMA 4.7. If for a B-space X, U(X) is a proper subset of B(X) then $B_w(X)$
is a proper subset of B(X).


Proof. By hypothesis there exists an $s \in B(X) \setminus U(X)$. Using a result of
Orlicz (1, (3) on p. 270), there is a strictly increasing sequence t of natural
numbers such that the sequence $\{s(t(i))\}$ does not have a weak sum. However
it obviously inherits the property of belonging to B(X) from s.


COROLLARY 4.8. The space $B_w(c_0)$ is a proper subspace of $B(c_0)$.


Proof. Since $B(c_0) \setminus U(c_0)$ is non-void the conclusion follows by Lemma
4.7.


Putting together the preceding corollaries we have the following


THEOREM 4.9. For the B-space $c_0$, $U(c_0) \subset B_s(c_0) \subset B_w(c_0) \subset B(c_0)$, and
each containment is proper.


REFERENCES 


1. S. Banach, Théorie des opérations linéaires (Warsaw, 1932).
2. G. Birkhoff, Integration of functions with values in a Banach space, Trans. Amer. Math. Soc.,
38 (1935), 357-378.
3. N. Dunford, Uniformity in linear spaces, Trans. Amer. Math. Soc., 44 (1938), 305-356.
4. I. Gelfand, Abstrakte Funktionen und lineare Operatoren, Mat. Sbornik, N.S., 46 (1938),
235-284.
5. H. Hadwiger, Über die Konvergenzarten unendlicher Reihen im Hilbertschen Raum, Math.
Zeit., 47 (1941), 325-329.
6. B. J. Pettis, On integration in vector spaces, Trans. Amer. Math. Soc., 44 (1938), 277-304.


$^1$ The author is indebted to the referee for the present form of Lemma 4.3, which is simpler
and more general than the original.

Received March 25, 1955; in revised form October 28, 1955. This paper is from Chapter V 
of the author's dissertation, On unconditional summability of sequences in semi-groups with a 
topology, Tulane University, August, 1954. It was done under Contract N7-onr-434, Task Order 
III, Navy Department, Office of Naval Research. The author thanks Professor B. J. Pettis 
for helpful suggestions regarding this paper. 


Alabama Polytechnic Institute 








ON A QUASI-LINEAR EQUATION 
RICHARD BELLMAN 


1. Introduction. The purpose of this note is to establish some limit
theorems for the non-linear recurrence relations

1.1    $$x_i(n+1) = \max_q \sum_{j=1}^{N} a_{ij}(q)\, x_j(n), \qquad i = 1, 2, \ldots, N;\ n \ge 0,$$

under certain assumptions concerning the initial values $c_i = x_i(0)$, and the
coefficient matrices $A(q) = (a_{ij}(q))$.

Equations of this type occur in various parts of the theory of dynamic 
programming, as we shall indicate below, and are, in addition, of interest in 
furnishing a link between the theory of linear and non-linear operations, as 
we have discussed elsewhere (1). 

Generally speaking, these equations arise in the consideration of processes 
of Markoff type, see (2), in which decisions are made at various stages of the 
process. 

Results corresponding to those obtained below hold for the more general
equations of the form

1.2    $$x_i(n+1) = \begin{cases} \max_q \sum_{j=1}^{N} a_{ij}(q)\, x_j(n), & i = 1, 2, \ldots, K < N, \\ \sum_{j=1}^{N} a_{ij}(q^*)\, x_j(n), & i = K+1, \ldots, N, \end{cases}$$

where $q^*$ in the lower equations is determined by the upper equations.


2. The homogeneous equation. Let us consider the equation

2.1    $$\lambda y_i = \max_q \sum_{j=1}^{N} a_{ij}(q)\, y_j, \qquad i = 1, 2, \ldots, N,$$

where we impose the following conditions:

2.2 (a) $q = (q_1, q_2, \ldots, q_M)$ runs over some set of values, S, with the property
that the maximum is attained in 2.1,

(b) $\infty > m \ge a_{ij}(q) > 0$ (i, j = 1, 2, ..., N) for $q \in S$,

(c) for any q, let $\phi(q)$ denote the characteristic root of $A(q) = (a_{ij}(q))$
of largest absolute value, the Perron root, known to be positive. We assume
that there exists at least one value of q for which $\phi(q)$ assumes its maximum
for $q \in S$.


Received July 20, 1955. 


We shall now prove 


THEOREM 1. Under these conditions, there exists a unique positive $\lambda$ with the
property that 2.1 has a positive solution, $y_i > 0$ (i = 1, 2, ..., N). This solution
is unique up to a multiplicative constant, and

2.3    $$\lambda = \max_{q \in S} \phi(q).$$


Proof. We begin by showing the existence of a positive $\lambda$ and a positive set
of solutions $\{y_i\}$. Consider the region defined by

$$y_i \ge 0, \qquad \sum_{i=1}^{N} y_i = 1.$$

The normalized transformation

2.4    $$y_i' = \Big[\max_q \sum_{j=1}^{N} a_{ij}(q)\, y_j\Big] \Big/ \Big[\sum_{k=1}^{N} \max_q \sum_{j=1}^{N} a_{kj}(q)\, y_j\Big]$$

is a continuous mapping of this region into itself. Hence there exists a fixed
point, $\{y_i\}$. This fixed point is a solution of 2.1, with $\lambda$ the denominator in
2.4. Each component $y_i$ is positive because of the positivity of $a_{ij}(q)$.


To show that this solution is unique up to a multiplicative constant, let
$[\mu, z]$ be another solution of 2.1 with $\mu > 0$ and z a positive vector. Let $\{q\}$
be the set of values for which the maximum is attained in 2.1 and $\{\bar q\}$ the similar
set associated with z. Observe that we may have different sets for each i. We
have then

2.5    $$\lambda y_i = \sum_{j} a_{ij}(q)\, y_j \ge \sum_{j} a_{ij}(\bar q)\, y_j, \qquad i = 1, 2, \ldots, N,$$
       $$\mu z_i = \sum_{j} a_{ij}(\bar q)\, z_j.$$

Let us now assume, without loss of generality, that $\lambda \le \mu$. Let $\epsilon$ be a positive
constant chosen so that one, at least, of the components $y_i - \epsilon z_i$ is zero, one
at least is positive, and the others are non-negative. This can always be
accomplished if y and z are not proportional. If i is an index for which $y_i - \epsilon z_i$
is zero, we have

2.6    $$0 = \mu(y_i - \epsilon z_i) \ge \lambda y_i - \mu \epsilon z_i \ge \sum_{j=1}^{N} a_{ij}(\bar q)(y_j - \epsilon z_j) > 0,$$

since $a_{ij}(\bar q) > 0$, a contradiction. Hence $\lambda = \mu$, and y and z are proportional.

To show that $\lambda = \max \phi(q)$, we proceed as follows. It is clear that $\lambda$, as
the characteristic root of some A(q), satisfies the inequality $\lambda \le \mu$, where
$\mu = \max \phi(q)$. Assume that actually $\lambda < \mu$. Let $z = (z_1, z_2, \ldots, z_N)$ be a
positive characteristic vector associated with $\mu$ and $\bar q$ a set of q-values which
yield $\mu = \phi(\bar q)$. Then we have

2.7    $$\mu z_i = \sum_{j=1}^{N} a_{ij}(\bar q)\, z_j \le \max_q \sum_{j=1}^{N} a_{ij}(q)\, z_j.$$









Since $y_i$ is positive, we can find a positive constant m such that $z_i \le m y_i$ for
i = 1, 2, ..., N. Hence 2.1 yields

2.8    $$\mu z_i \le m \max_q \sum_{j=1}^{N} a_{ij}(q)\, y_j = m \lambda y_i.$$

Thus $z_i \le m \lambda y_i/\mu$. Iterating this, we obtain $z_i \le m y_i (\lambda/\mu)^k$, for arbitrary k.
Since $\lambda/\mu < 1$, by assumption, this yields $z_i = 0$, a contradiction. Hence
$\lambda = \mu$.
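
Theorem 1 can be illustrated numerically. The sketch below is not the argument of the proof (which rests on a fixed point theorem); it simply iterates the normalized transformation 2.4 and compares the resulting $\lambda$ with the largest Perron root. All data are hypothetical: N = 2, each row may be chosen independently from two candidate rows (so that S is the set of the four row selections), and the closed formula for the Perron root of a positive 2 by 2 matrix is used for the comparison.

```python
import math

# Hypothetical data: ROW_CHOICES[i] lists the admissible i-th rows a_i.(q); all entries positive.
ROW_CHOICES = [
    [(0.5, 0.2), (0.3, 0.6)],   # candidates for row 1
    [(0.4, 0.4), (0.1, 0.9)],   # candidates for row 2
]


def max_map(y):
    """Componentwise max over q of sum_j a_ij(q) y_j."""
    return [max(sum(a * v for a, v in zip(row, y)) for row in rows)
            for rows in ROW_CHOICES]


def perron_root_2x2(r1, r2):
    """Largest characteristic root of the positive matrix with rows r1, r2."""
    (a, b), (c, d) = r1, r2
    return ((a + d) + math.sqrt((a - d) ** 2 + 4 * b * c)) / 2


# Iterate the normalized transformation 2.4 on the simplex y_i >= 0, sum y_i = 1.
y = [0.5, 0.5]
lam = 0.0
for _ in range(200):
    image = max_map(y)
    lam = sum(image)              # the denominator in 2.4
    y = [v / lam for v in image]

best = max(perron_root_2x2(r1, r2)
           for r1 in ROW_CHOICES[0] for r2 in ROW_CHOICES[1])
print(lam, best, y)
```

For these data the two printed numbers should agree to many decimal places, and the final y is a positive vector satisfying 2.1 up to rounding.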


3. The recurrence relation. Let us now return to the recurrence relation 
of 1.1 and prove 


THEOREM 2. If, in addition to the conditions of 2.2, we assume that there is a
unique q for which the maximum value of $\phi(q)$ is attained and that $c_i \ge 0$,
then

3.1    $$x_i(n) \sim a\, y_i \lambda^n,$$

as $n \to \infty$, where a is a constant dependent upon the initial values $c_i$.


Proof. Let us take $c_i > 0$ without loss of generality. There are then two
positive constants k and K such that $k y_i \le c_i \le K y_i$ (i = 1, 2, ..., N). Let
us show inductively that

3.2    $$k y_i \lambda^n \le x_i(n) \le K y_i \lambda^n.$$

Assume that we have the result for n, then

3.3    $$x_i(n+1) \le K\lambda^n \max_q \sum_{j=1}^{N} a_{ij}(q)\, y_j = K\lambda^{n+1} y_i,$$
       $$x_i(n+1) \ge k\lambda^n \max_q \sum_{j=1}^{N} a_{ij}(q)\, y_j = k\lambda^{n+1} y_i.$$


To establish the asymptotic behavior we show that for n sufficiently large
the set of q's which furnish the maximum in 1.1 is precisely the set which
yields $\lambda = \max \phi(q)$.

Assume the contrary. This means that infinitely often we employ a set $\{\bar q\}$
which is not identical with the q which furnishes the maximum in $\phi(q)$.

We then have, for i = 1, 2, ..., N,

3.4    $$x_i(n+1) = \sum_{j=1}^{N} a_{ij}(\bar q)\, x_j(n) \le K\lambda^n \sum_{j=1}^{N} a_{ij}(\bar q)\, y_j.$$

For some index i we must have

3.5    $$\sum_{j=1}^{N} a_{ij}(\bar q)\, y_j < \lambda y_i,$$

with strict inequality. For if

$$\sum_{j=1}^{N} a_{ij}(\bar q)\, y_j \ge \lambda y_i$$

for all i, the characteristic root of $A(\bar q) = (a_{ij}(\bar q))$ of largest absolute value,
$\phi(\bar q)$, would at least equal $\lambda = \max \phi(q)$, which would contradict the
assumption concerning the uniqueness of the maximum of $\phi(q)$.

Hence, for some component, say the first, we have

3.6    $$x_1(n+1) \le \theta K \lambda^{n+1} y_1, \qquad 0 < \theta < 1.$$

Since $a_{ij}(q^*) > 0$ for all i, j, where $q^*$ is the value of q for which $\lambda = \phi(q^*)$, we
see that, for i = 1, 2, ..., N,

3.7    $$x_i(n+2) \le K\lambda^{n+1}\Big[\sum_{j=2}^{N} a_{ij}(q^*)\, y_j + \theta a_{i1}(q^*)\, y_1\Big] \le \theta_1 K \lambda^{n+2} y_i,$$

where $\theta_1 < 1$.
If therefore a set of q's distinct from $q^*$ are used R times, we obtain

3.8    $$x_i(n) \le \theta_1^R K \lambda^n y_i,$$

for n sufficiently large. Since $0 < \theta_1 < 1$, if R is too large we eventually
contradict the lower bound for $x_i(n)$.

Hence for $n \ge n_0 = n_0(c_i)$, we have

3.9    $$x(n+1) = A(q^*)\, x(n),$$

whence the asymptotic statement of 3.1 follows.
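
The asymptotic relation 3.1 can be observed by running the recurrence 1.1 directly. In the sketch below the candidate rows, the initial values and the number of steps are hypothetical; each row is chosen independently, and the comparison value is the Perron root of the row selection ((0.3, 0.6), (0.1, 0.9)), which is the unique maximizing selection for these particular data.

```python
# Hypothetical data: ROW_CHOICES[i] lists the admissible i-th rows a_i.(q); all entries positive.
ROW_CHOICES = [
    [(0.5, 0.2), (0.3, 0.6)],
    [(0.4, 0.4), (0.1, 0.9)],
]


def step(x):
    """One step of 1.1: x_i(n+1) = max over q of sum_j a_ij(q) x_j(n)."""
    return [max(sum(a * v for a, v in zip(row, x)) for row in rows)
            for rows in ROW_CHOICES]


x = [1.0, 3.0]          # initial values c_i > 0 (hypothetical)
ratios = []
for n in range(80):
    nxt = step(x)
    ratios = [b / a for a, b in zip(x, nxt)]   # x_i(n+1) / x_i(n)
    x = nxt

# Perron root of the matrix with rows (0.3, 0.6) and (0.1, 0.9): trace 1.2, discriminant 0.6.
lam = (1.2 + 0.6 ** 0.5) / 2
print(ratios, lam, x[0] / x[1])
```

Both ratios should settle at $\lambda$, and the quotient $x_1(n)/x_2(n)$ at $y_1/y_2$, as 3.1 predicts.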


4. A dynamic programming problem. Suppose that we are engaged
in a multi-stage decision process of the following type. At each stage we have
our choice of various operations, which we number i = 1, 2, ..., K. The ith
operation has a probability distribution attached with the following properties:


4.11 There is a probability $p_{ik}$ that we receive k units and the process continues,
k = 1, 2, ..., R.


4.12 There is a probability $p_{i0}$ that we receive nothing and the process
terminates.


How do we proceed so as to maximize the probability that we receive at
least n units before the process terminates?
Let us define the sequence


4.2 u(n) = the probability of attaining at least n units before the termination
of the process using an optimal procedure.


Then using the intuitive "principle of optimality" (1), we see that u(n)
satisfies the recurrence relation

4.3    $$u(n) = \begin{cases} \max_i \sum_{k=1}^{R} p_{ik}\, u(n-k), & n > 0, \\ 1, & n \le 0. \end{cases}$$













Using methods similar to those above, we see that for large n,

4.4    $$u(n) \sim c\rho^n,$$

where $\rho$ is the root of largest absolute value, necessarily positive, of

4.5    $$1 = \sum_{k=1}^{R} p_{ik}\, \rho^{-k},$$

for the value of i which maximizes $\rho$.
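
The recurrence 4.3 and the growth rate in 4.4 and 4.5 can be tried on a small example. The two operations below, their probabilities, the horizon of 200 steps and the bisection used to solve 4.5 are all assumptions made for this sketch and are not part of the text.

```python
# Hypothetical operations: OPS[i][k] is the probability p_ik of receiving k units and continuing;
# the remaining mass 1 - sum_k p_ik is the termination probability p_i0.
OPS = [
    {1: 0.5, 2: 0.3},
    {1: 0.2, 3: 0.6},
]


def optimal_probabilities(n_max):
    """u(n) from 4.3 for n = 1, ..., n_max, with u(n) = 1 for n <= 0."""
    u = {}

    def value(n):
        return 1.0 if n <= 0 else u[n]

    for n in range(1, n_max + 1):
        u[n] = max(sum(p * value(n - k) for k, p in op.items()) for op in OPS)
    return u


def growth_root(op):
    """Positive root rho of 1 = sum_k p_k rho^(-k), located by bisection on (0, 1]."""
    lo, hi = 1e-9, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if sum(p * mid ** (-k) for k, p in op.items()) > 1.0:
            lo = mid      # the sum decreases in rho, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


u = optimal_probabilities(200)
rho = max(growth_root(op) for op in OPS)
ratio = u[200] / u[199]
print(ratio, rho)
```

For these data the ratio u(n+1)/u(n) should approach the larger of the two roots, which belongs to the second operation.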


5. An analogue of a result of Markoff. Markoff showed that if

5.1    $$x_i(n+1) = \sum_{j=1}^{N} a_{ij}\, x_j(n) \qquad (n = 0, 1, \ldots)$$

and $x_i(0) > 0$, with the conditions

5.2    $$a_{ij} > 0, \qquad \sum_{j} a_{ij} = 1 \qquad (i = 1, 2, \ldots, N),$$

then

5.3    $$\lim_{n \to \infty} x_i(n) = c, \qquad i = 1, 2, \ldots, N,$$

where c depends on the initial values.
The same proof shows that the same result holds for the sequence defined
by

5.4    $$x_i(n+1) = \max_q \sum_{j=1}^{N} a_{ij}(q)\, x_j(n),$$

provided that the conditions in 5.2 hold uniformly in q. The constant will,
of course, in general, be different from that above.
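
The convergence of all components to a common constant under 5.4 is easy to observe. The sketch below uses hypothetical data: three components, two admissible rows per component, every row positive with sum 1, and arbitrary positive initial values.

```python
# Hypothetical candidates: each admissible row is positive and sums to 1.
ROW_CHOICES = [
    [(0.2, 0.5, 0.3), (0.6, 0.1, 0.3)],
    [(0.3, 0.3, 0.4), (0.1, 0.8, 0.1)],
    [(0.25, 0.25, 0.5), (0.4, 0.4, 0.2)],
]


def step(x):
    """One step of 5.4: x_i(n+1) = max over q of sum_j a_ij(q) x_j(n)."""
    return [max(sum(a * v for a, v in zip(row, x)) for row in rows)
            for rows in ROW_CHOICES]


start = [1.0, 2.0, 4.0]   # hypothetical initial values x_i(0) > 0
x = list(start)
for _ in range(200):
    x = step(x)

spread = max(x) - min(x)
print(x, spread)
```

Each new value is a weighted average of the old ones, so the common limit lies between the smallest and the largest initial value.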


REFERENCES 


1. R. Bellman, The theory of dynamic programming, Bull. Amer. Math. Soc., 60 (1954), 503- 
516. 
2. W. Feller, An introduction to probability theory and its applications (New York, 1950). 


Rand Corporation, 
Santa Monica, California 




MODIFIED BOUNDARY VALUE PROBLEMS FOR A 
QUASI-LINEAR ELLIPTIC EQUATION 


G. F. D. DUFF 


1. Introduction. The quasi-linear elliptic partial differential equation
to be studied here has the form

(1.1)    $$\Delta u = -F(P, u).$$

Here $\Delta$ is the Laplacian while F(P, u) is a continuous function of a point P
and the dependent variable u. We shall study the Dirichlet problem for (1.1)
and will find that the usual formulation must be modified by the inclusion of a
parameter in the data or the differential equation, together with a further
numerical condition on the solution.

The negative sign on the right in (1.1) is included for convenience and also
to emphasize that the behaviour of the right side will be the opposite of that
usually studied. We shall generally take F(P, u) to be a positive increasing
function of u, these conditions being motivated by the following physical
problem. Consider an equilibrium distribution of heat in a medium where the
source density of heat generated depends on temperature u:

$$\rho = \rho(u) = F(P, u).$$

That $\rho$ and hence F(P, u) in (1.1) should be positive and increasing with u
is a natural assumption.

The known results for quasi-linear equations such as (1.1) are, roughly
speaking, of two kinds: local theorems, and in-the-large existence proofs for
equations

(1.2)    $$\Delta u = +F(P, u)$$

where F(P, u) is an increasing function of u. By local theorems are meant
those in which some restriction of size is placed on the boundary values, the
domain, or the non-linearity of the function F(P, u). Among these we might
include the case when F(P, u) is bounded independently of u. The Dirichlet
theorem and various other boundary value results have been proved in such
circumstances (2; 4, vol. II, Ch. V; 6, Ch. II).

On the other hand, global existence theorems for (1.2) have been found by
many authors (3; 6, Ch. II). The possibility of this may be recognized if one

Received August 11, 1955. This paper was written at the 1955 Summer Research Institute 
of the Canadian Mathematical Congress. The author's thanks are due to the National Research 
Council of Canada for a fellowship held at this time. He is also indebted to Professor J. Leray 
for a stimulating discussion and helpful comments. 




constructs the equation of variation of (1.2) with respect to an external
parameter: this variational equation has the linear form

(1.3)    $$\Delta v = F_u(P, u)\, v,$$

with positive coefficient $F_u(P, u)$. Such equations satisfy a maximum principle
in the sense that the maximum absolute value of any solution is taken on the
boundary. Thus a priori estimates can be found for the solutions of (1.3) and
hence for those of (1.2).

These methods will not apply to (1.1). Even in the linear case, it is evident
that the usual statement of the Dirichlet problem, namely the assertion that a
solution having given boundary values exists, does not hold unless F(P, u) is
restricted in some way. Indeed, if $\lambda$ is an eigenvalue, solutions of $\Delta u + \lambda u = 0$
have boundary values restricted by one or more conditions of orthogonality.
This particular case will be relevant to Theorem III below; we shall later
furnish a similar example which pertains to the main Theorem I and which
shows that the conventional Dirichlet problem is not then always solvable.

This discussion suggests that we should frame boundary value problems 
for (1.1) in such a way that some a priori bound can be included in the state- 
ment of the problem. We will show that in a certain sense it is sufficient to 
bound the solution from above. In fact we assume that the actual maximum 
of the solution has a stated value. If, however, one additional numerical 
condition is assigned, it is evident that a corresponding degree of freedom 
should be allowed for the boundary values of the solution. This we shall 
permit by introducing a parameter t, of the nature of an eigenvalue parameter,
into the boundary condition. Thus the main theorem asserts the existence of a 
solution with a stated maximum and with boundary values proportional to a 
given function. 

We then establish some variations of this theorem, allowing the parameter 
to appear in various ways in the differential equation instead of the boundary 
condition. These solutions have an assigned maximum together with given 
boundary values. We conclude with a Neumann boundary value theorem for an 
equation similar to (1.1) but containing an additional linear term. 


2. Preliminaries. Let $V_N$ be a Riemannian manifold of dimension N
with positive definite metric of class $C^1$ in a given coordinate network:

$$ds^2 = a_{ik}\, dx^i dx^k;$$

then the Laplace operator has the form

(2.1)    $$\Delta u = \frac{1}{\sqrt{a}} \frac{\partial}{\partial x^i}\Big(\sqrt{a}\, a^{ik} \frac{\partial u}{\partial x^k}\Big),$$

where $a = |a_{ik}|$ and the associate tensor $a^{ik}$ satisfies

$$a^{ij} a_{jk} = \delta^i_k.$$

We consider a compact domain D of $V_N$, having a boundary surface B of
class $C^2$ in the above coordinate system. Points of D will be denoted by capitals
P, Q, ... and points of B by lower case p, q, .... We assume that F(P, u) is a
continuous function of P and u together; other conditions appropriate to each
theorem will be stated separately. All functions and parameters used are
real-valued.

The existence proofs which follow will be based on the Schauder-Leray
theorem (5), which we will state here. We work with the separable Banach
space C of continuous functions on the closure of the domain D, with the norm

(2.2)    $$\|u\| = \max_{P \in D} |u(P)|.$$

Let $\Omega$ be a bounded domain of C, with boundary $\Omega'$, and let $T_k[u]$ be an
operator defined in $\Omega + \Omega'$ which satisfies the conditions

(a) $T_k[u]$ is jointly uniformly continuous in k and u, for $0 \le k \le 1$ and
$u \in \bar\Omega = \Omega + \Omega'$.

(b) $T_k[u]$ is a compact or completely continuous operator, transforming
bounded sets into compact sets (1, Ch. VI). Suppose also that the equation

(2.3)    $$u = T_k[u]$$

has no solution on $\Omega'$ for $0 \le k \le 1$, and that for k = 0 it has a solution in $\Omega$.
Finally, let

$$v = u - T_0[u]$$

be a homeomorphism of C. Then the conclusion of the Schauder-Leray theorem
is that the equation

(2.4)    $$u = T_1[u]$$

has at least one solution in $\Omega$.

Separate choices for $T_k[u]$ and for $\Omega$ will be made in each of the following
theorems. In each case condition (b) above is satisfied essentially because the
integral operator with kernel the Green's function of D for $\Delta u = 0$ is
completely continuous in the space C. We include a demonstration of this in the
proof of Theorem I.


3. The modified Dirichlet problem. Let M be a positive number
given in advance, and let f(p) be a $C^1$ function on the boundary B which is
positive:

(3.1)    $$0 < m_0 \le f(p) \le M_0 < \infty.$$

We also take $m_0 < 1$, which is always possible, for a reason which will appear
later. Let F(P, u) be a positive non-decreasing function of u.


THEOREM I. There exists a solution of (1.1) with maximum value M and
boundary values proportional to f(p).


The constant of proportionality being denoted by t, we have

(3.2)    $$u(p) = t f(p)$$










and also

(3.3)    $$\max_{P \in D} u(P) = M.$$

To establish the existence of such a solution, we begin by constructing the
harmonic function $v_0(P)$ with boundary values f(p). Thus

(3.4)    $$\Delta v_0(P) = 0, \qquad v_0(p) = f(p),$$

and in view of the maximum principle for harmonic functions and (3.1) we
have

(3.5)    $$m_0 \le \min f(p) \le v_0(P) \le \max f(p) = M_0,$$

these inequalities holding for $P \in D + B$.
Now a solution of (1.1) with boundary values t f(p) satisfies the integral
equation

(3.6)    $$u(P) = \int_D G(P, Q)\, F(Q, u(Q))\, dV_Q + t\, v_0(P),$$

where G(P, Q) is the harmonic Green's function of the domain D. We note that
G(P, Q) is non-negative (2). Conversely, a solution of (3.6) is actually a
solution of (1.1) with boundary values t f(p), as may be verified by operating
on (3.6) with the Laplacian, and noting that the integral on the right has
vanishing boundary values. We observe that to satisfy the maximum condition
(3.3) we must make an appropriate choice of t, which will in turn depend on
u(P), so that a fixed value for t cannot be determined at this stage.
We therefore define the non-linear functional

(3.7)    $$T_k[u](P) = \int_D G(P, Q)\, F(Q, k u(Q))\, dV_Q + t_k[u]\, v_0(P),$$

where $t = t_k[u]$ is so chosen that

(3.8)    $$\max_{P \in D + B} T_k[u](P) = M.$$

Since $v_0(P)$ satisfies (3.5) we see that such a choice of t is always possible,
since the right side of (3.7) is a strictly increasing function of t tending to
$+\infty$ with t.

We now show that $t_k[u]$ is bounded, provided that $0 \le k \le 1$ and $u \le K$,
where K is a fixed constant. Let

(3.9)    $$A = \max_{P \in D} \int_D G(P, Q)\, F(Q, K)\, dV_Q;$$

this number exists and is positive. Now if $A < M$ it would appear that
$t_k[u]$ in (3.7) should be positive. However it is clear that

$$t_k[u] \le M/M_0,$$

since $M_0 = \max v_0(P)$ and $T_k[u] \le M$. This furnishes an upper bound for
$t_k[u]$. If $M < A$, $t_k[u]$ may be negative; however

$$\int_D G(P, Q)\, F(Q, k u(Q))\, dV_Q \le A,$$

since $ku \le K$ for $0 \le k \le 1$ and F is a non-decreasing function of u. Thus the
multiple of $v_0(P)$ required to reduce the maximum of $T_k[u]$ to M does not
exceed $(A - M)/m_0$. We therefore have

(3.10)    $$-\frac{|A - M|}{m_0} \le t_k[u] \le \frac{M}{M_0}.$$

This shows that if u is bounded above, $t_k[u]$ is bounded below and in fact
bounded. The lower bound depends on A and hence on the upper bound K
of u.

Since the integral in (3.7) is non-negative, it follows that $T_k[u]$ is bounded
below:

(3.11)    $$-\frac{M_0}{m_0}|A - M| \le T_k[u].$$


Combining (3.8) and (3.11), we see that $T_k[u]$ is bounded in both directions.
To apply the Schauder-Leray theorem, we set

$$K = 2M/m_0 > M,$$

and let A be defined by (3.9) with this value of K. Then we choose $\Omega$ to be the
connected domain of C defined by

(3.12)    $$-\frac{M_0}{m_0}|A - M| - \epsilon < u < \frac{2M}{m_0}, \qquad \epsilon > 0.$$

The boundary $\Omega'$ consists of those functions u for which equality holds on
either side for one or more points of D + B. Now $T_k[u]$ is defined on $\Omega + \Omega'$
and is continuous in both k and u. This is easily verified since F(P, ku) is
uniformly continuous in ku, and the integral

$$\int_D G(P, Q)\, dV_Q$$

is a continuous function of P, vanishing on B, and so is bounded. Thus the
integral in (3.7) depends continuously on ku and so, therefore, does $t_k[u]$.
Hence $T_k[u]$ is continuous in k and u together.

We now show that $T_k[u]$ is a compact operator in C. Let $\{u_n\}$ be a uniformly
bounded sequence of continuous functions. From (3.8) and (3.11) we see that
$T_k[u_n]$ is bounded, independently of n and P. We now show that the sequence
$T_k[u_n](P)$ is equicontinuous in P by forming the difference

(3.13)    $$|T_k[u_n](P_2) - T_k[u_n](P_1)| \le \int_D |G(P_2, Q) - G(P_1, Q)|\, F(Q, k u_n(Q))\, dV_Q + |t_k[u_n]|\, |v_0(P_2) - v_0(P_1)|.$$

Since $u_n$ is bounded independently of n, so is $F(Q, k u_n)$ and also $t_k[u_n]$: let $F_0$
and $t_0$ be bounds for the absolute values of these sequences. Thus the
preceding difference is less than

$$F_0 \int_D |G(P_2, Q) - G(P_1, Q)|\, dV_Q + t_0\, |v_0(P_2) - v_0(P_1)|.$$

The second term here tends to zero as $P_2 \to P_1$, since $v_0(P)$ is continuous.
To estimate the integral containing the Green's functions, we suppose that the
distance $s(P_1, P_2) < \delta$ and denote by $S_\eta$ a geodesic sphere of radius $\eta$ about $P_1$.
For $P \ne Q$, G(P, Q) is continuous, and we can therefore choose $\delta$ so small that
for $Q \in D - S_\eta$, the difference

$$|G(P_2, Q) - G(P_1, Q)| < \epsilon_1.$$

We then write

(3.14)    $$\int_D |G(P_2, Q) - G(P_1, Q)|\, dV_Q \le \int_{D - S_\eta} |G(P_2, Q) - G(P_1, Q)|\, dV_Q + \int_{S_\eta} \{G(P_2, Q) + G(P_1, Q)\}\, dV_Q$$
          $$\le \epsilon_1 \int_D dV_Q + \int_{S_{2\eta}} G(P_2, Q)\, dV_Q + \int_{S_\eta} G(P_1, Q)\, dV_Q.$$

Here $S_{2\eta}$ is a sphere of radius $2\eta$ about $P_2$, which certainly contains $S_\eta$ if
$s(P_1, P_2) < \eta$. Since

$$G(P, Q) \sim \frac{1}{(N-2)\,\omega_N}\, s(P, Q)^{-N+2}, \qquad P \to Q,$$

the integrals over small spheres converge like

$$\int_{S_\eta} G(P, Q)\, dV_Q \sim \frac{1}{2(N-2)}\, \eta^2 \to 0, \qquad \eta \to 0,$$

uniformly with respect to P in D. Given $\epsilon > 0$, we choose $\eta$ so small that the
second and third terms on the right in (3.14) are each less than $\tfrac14\epsilon$. We can
then choose $\delta < \eta$ so small that the first term is less than $\tfrac14\epsilon$. Also for $s(P_2, P_1)$
sufficiently small the second term on the right of (3.13) can be made less than
$\tfrac14\epsilon$. This shows, finally, that the sequence $T_k[u_n](P)$ is equicontinuous,
uniformly for P in D + B. By Ascoli's theorem (1), the sequence contains a
uniformly convergent subsequence with a continuous limit. That is, $T_k[u]$
is a compact operator in C.
Next we demonstrate that for $0 \le k \le 1$, the equation

(3.15) $u(P) = T_k[u](P)$

has no solution lying on the boundary $\Omega'$. Since for any solution,

$\max u = \max T_k[u] = M,$

we see from (3.12) and the condition $m_0 \le 1$ that

$u(P) \le M < 2M/m_0 = K.$


Since $A$ was defined by (3.9) for this $K$, we see that if $t_k[u] < 0$, then
2M 


— @ —- — 
mo 


|A-M|< <M \A — M| < Moty{u] 
0 


< t,[u] vo(P) 


$\le \int_D G(P, Q)\,F(Q, ku(Q))\,dV_Q + t_k[u]\,v_0(P)$
$= T_k[u](P) = u(P).$


Hence the strict inequality on the left holds in (3.12) for any solution and so if
$t_k[u] < 0$ no solution can lie in $\Omega'$. If $t_k[u] \ge 0$, then $T_k[u] \ge 0$ and the same
conclusion follows at once.

Now for $k = 0$ the equation (3.15) has a unique solution since the operator
$T_0[u]$ is then independent of $u$. (Thus the mapping $v = u - T_0[u]$ is a
homeomorphism.) In fact the solution $u$ for $k = 0$ is the solution of
$\Delta u = -F(P, 0)$, with $\max u = M$ and $u(p) = t f(p)$.

From the Schauder-Leray theorem we may now conclude that (3.15) has a
solution for each $k$, $0 \le k \le 1$. For $k = 1$, we observe that in view of (3.7),
(3.15) becomes equivalent to the integral equation (3.6). Thus the solution
$u(P)$ for $k = 1$ satisfies (1.1) and has boundary values $t f(p)$. From (3.8) and
(3.15) it follows that its maximum value is $M$. This completes the proof of
the theorem.

Two minor extensions of this result will be noted here. First, we can treat
the case where $F(P, u)$ is only bounded below:

$F(P, u) \ge -K_1$

by taking as a new variable $\bar{u} = u + v$, with $v$ the solution of $\Delta v = -K_1$
which vanishes on $B$. Second, we may replace the boundary values $t f(p)$ by a
more general continuous function $f(p, t)$ which is strictly increasing with $t$
and tends to $\pm\infty$ with $t$.


4. Qualitative behaviour of the boundary values. The theorem of the
preceding section would be of comparatively small interest if it were possible
to solve the conventional Dirichlet problem which concerns the existence of a
solution with given boundary values. We show that this problem is not solvable
for the class of non-linear equations here considered.

Let $\lambda_1$ be the lowest Dirichlet eigenvalue of $D$ for $\Delta u + \lambda u = 0$, and let
the corresponding eigenfunction be denoted by $u_1$. From (4, vol. I, ch. VI,
§6) we see that $u_1$ is of one sign in $D$, say non-negative. Hence the outward
normal derivative $\partial u_1/\partial n$ is non-positive, and also does not vanish identically.










Now let $u$ be any solution of (1.1) with boundary values $t f(p)$, and let us
suppose that
(4.1) $F(P, u) > \lambda_1 u$


for all values of $u$. Then the value of $t$ is necessarily negative.
This assertion follows readily from Green's formula, since

$t \int_B f\,u_{1n}\,dS = \int_B (u\,u_{1n} - u_1 u_n)\,dS$
(4.2) $= \int_D (u\,\Delta u_1 - u_1\,\Delta u)\,dV$
$= \int_D u_1\,[F(P, u) - \lambda_1 u]\,dV.$

Since $u_1 > 0$ in $D$ the integral on the right is positive and since
$\int_B f\,u_{1n}\,dS < 0$ we conclude that $t < 0$. If in (4.1) the equality sign is
permitted we would find $t \le 0$; the case

(4.3) $F(P, u) = \tfrac12 \lambda_1 (u + |u|)$

illustrates this possibility.

Thus, if (4.1) holds, (1.1) can not have any solutions with positive boundary 
values. This shows that the conventional Dirichlet problem for (1.1) is 
impossible. Since in the physical interpretation of heat generation one would 
expect $F(P, u)$ to be a rapidly increasing function of $u$ as $u \to +\infty$, it seems
worthwhile to find the closest analogue of the conventional Dirichlet theorem 
for such equations. Though Theorem I is not the only variant which might be 
considered, it has physical meaning since: 

(a) the maximum temperature is prescribed. 


(b) the distribution (or ratio) of temperatures on the boundary is pre- 
scribed, so that if the actual boundary value is known at one point, all other 
boundary values are determined. 

We continue the qualitative discussion of the values of $t$. If (4.1) holds
only for

(4.4) $u > u_0,$

we have
(4.5) $t \le \dfrac{u_0}{m_0}, \qquad m_0 = \min_{p \in B} f(p),$

since otherwise we should have $t f(p) > u_0$ and, the minimum value of a
solution of (1.1) being assumed on the boundary, this would lead to

$u > u_0$ in $D$.

But then (4.1) and (4.2) show that $t < 0$, which is a contradiction.
If we regard $t$ as a function of $M$ for fixed $f(p)$, we can show that $t$ is a
continuous function of $M$. This follows from the Schauder-Leray theorem if
we consider the functional

$T_k[u](P) = T_{k,M}[u](P)$

in its dependence on $M$. We need only choose the domain $\Omega$ so that $M$ is free
to vary in a small interval and so that no solution of $u(P) = T_{k,M}[u](P)$ can
cross the boundary of $\Omega$. The reader will readily be able to supply the details
here.


We now show that if
(4.6) $F_u(P, u) < \lambda_1, \qquad u < \bar{M},$
then $t$ is a monotone strictly increasing function of $M$ for $M < \bar{M}$. This will be
established by finding a contradiction to the contrary assumption, which is
that there exist $M_1$ and $M_2$, $M_1 < M_2 < \bar{M}$, such that $t_1 \ge t_2$. Let $u_1$ and $u_2$
be the respective solutions. Then $w = u_2 - u_1$ satisfies

$\Delta w = -F(P, u_2) + F(P, u_1)$
$= -(u_2 - u_1)\,F_u(P, u_1 + \theta(P)(u_2 - u_1))$
$= -w\,F_u(P, u_3),$

say. Here $u_3$ is intermediate in value to $u_1$ and $u_2$, so $u_3 < \bar{M}$. Since

$w = u_2 - u_1 = (t_2 - t_1) f(p) \le 0$ on $B$

and $w \ge M_2 - M_1 > 0$ at the maximum of $u_2$, there exists a domain $D_1 \subset D$
wherein $w$ is positive, and such that $w = 0$ on the boundary $B_1$ of $D_1$. Let
$\lambda'$ be the lowest Dirichlet eigenvalue of $D_1$; then (4, vol. I, ch. VI, §6) we have
$\lambda_1 \le \lambda'$ since $D_1 \subset D$. Let $u_1'$ be the corresponding eigenfunction; we see as in
(4.2) that

$0 = \int_{D_1} u_1'\,w\,[F_u(P, u_3) - \lambda']\,dV,$

and this is a contradiction since $u_1' > 0$, $w > 0$ and $F_u(P, u_3) < \lambda_1 \le \lambda'$ in
$D_1$, no one of the three factors vanishing in any open subset of $D_1$. This proves
the results stated.


For example, if

$F(P, u) = 0$ for $u < 0$, $\qquad F(P, u) = u^n$ for $u \ge 0$, $\quad n > 1,$

we see that (4.1) holds for

$u > u_0 = \lambda_1^{1/(n-1)}$

and so an upper bound for $t$ is known. For $M = 0$ the solution $u = 0$ fulfills
the conditions of Theorem I with $t = 0$. Since (4.6) holds for

$u < \left(\dfrac{\lambda_1}{n}\right)^{1/(n-1)},$

we see that $t$ increases and is positive for

$0 < M < \left(\dfrac{\lambda_1}{n}\right)^{1/(n-1)}.$

The behaviour of $t$ as $M \to \infty$ seems difficult to determine.
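
The two thresholds of this example can be evaluated numerically. The following is only a minimal sketch: the function name `thresholds` and the sample value $\lambda_1 = 1$ (the lowest Dirichlet eigenvalue of the one-dimensional interval $(0, \pi)$) are illustrative assumptions, not part of the paper. It returns the value above which $u^n > \lambda_1 u$, and the value below which $n u^{n-1} < \lambda_1$.

```python
def thresholds(lambda1: float, n: float) -> tuple[float, float]:
    """Thresholds for the model nonlinearity F(u) = u**n (u >= 0), 0 (u < 0).

    Returns (u0, m_bar), where
      u**n > lambda1 * u        exactly when u > u0,
      n * u**(n - 1) < lambda1  exactly when 0 <= u < m_bar.
    """
    if n <= 1 or lambda1 <= 0:
        raise ValueError("need n > 1 and lambda1 > 0")
    exponent = 1.0 / (n - 1.0)
    u0 = lambda1 ** exponent
    m_bar = (lambda1 / n) ** exponent
    return u0, m_bar


# Hypothetical sample values: lambda1 = 1 (interval (0, pi)), cubic nonlinearity.
u0, m_bar = thresholds(lambda1=1.0, n=3.0)
```

Since $n > 1$, the second threshold is always smaller than the first, so the range of $M$ on which $t$ is known to increase lies below the level at which (4.1) takes effect.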


5. Related eigenvalue problems. The theorems of this section differ
from the preceding result in that the parameter $t$ appears in the differential
equation instead of the boundary condition. They have therefore the character
of eigenvalue problems, although the conditions to be fulfilled by the solution
include the assigning of boundary values.

Let $F(P,u)$ be a continuous positive function, bounded away from zero:

(5.1) $F(P,u) \ge \delta > 0,$
and consider the problem of finding a solution of
(5.2) $\Delta u = -t F(P, u)$

with given boundary values $f(p)$ and a given maximum $M$. Let us assume that
$f(p)$ is $C^1$ with maximum

(5.3) $M_0 = \max f(p).$

Then without loss of generality we may take
(5.4) $M > M_0,$

since in any case $M \ge M_0$ is necessary, while if $M = M_0$, we may take $t = 0$
in (5.2) and find a harmonic solution of the problem.
Since a solution of the problem satisfies the integral equation

(5.5) $u(P) = t \int_D G(P, Q)\,F(Q, u(Q))\,dV_Q + v_0(P),$

where $v_0(P)$ is again harmonic with boundary values $f(p)$, we define the
new operator

(5.6) $T_k^1[u](P) = t_k^1[u] \int_D G(P, Q)\,F(Q, ku(Q))\,dV_Q + v_0(P),$

with the choice of $t$ governed by the condition
(5.7) $\max_{P \in D} T_k^1[u](P) = M.$


To show that this choice is possible we note that the non-negative integral

(5.8) $\int_D G(P, Q)\,dV_Q$

has a maximum $G_0$ say for $P = P_0$ in $D$. Now for $t = 0$ the right side of (5.6)
is less than $M$; consequently $t_k^1[u]$ must be positive. As $t$ increases, so does the
expression on the right in (5.6). However at $P = P_0$ we have

$t \delta G_0 \le t \delta \int_D G(P_0, Q)\,dV_Q$
$\le t \int_D G(P_0, Q)\,F(Q, ku(Q))\,dV_Q$
$\le M - v_0(P_0).$


Let us denote by $m_0$ the minimum of $f(p)$; then by the maximum principle for
harmonic functions

$m_0 \le v_0(P), \qquad P \in D,$
and so we find
(5.9) $0 < t_k^1[u] \le (M - m_0)\,\delta^{-1} G_0^{-1}.$
Since $t_k^1[u]$ is positive, we have
$m_0 \le v_0(P) \le T_k^1[u]$
and therefore $T_k^1[u]$ has the bounds

(5.10) $m_0 \le T_k^1[u] \le M.$
We now choose for $\Omega$ the connected region of $C$:
(5.11) $\Omega: m_0 - \varepsilon < u < M + \varepsilon,$

and consider the equation
(5.12) $u = T_k^1[u],$

for $0 \le k \le 1$. That $T_k^1[u]$ is jointly continuous in $k$ and $u$ is evident on
inspection. To show that this operator is compact, we select from any bounded
set of functions a subsequence $\{u_n\}$ such that $t_k^1[u_n]$ converges to a limit.
This is possible on account of (5.9). A proof similar to that in the preceding
sections shows that

$\int_D G(P, Q)\,F(Q, u(Q))\,dV_Q$

is compact, and the result follows if we consider the subsequence $\{u_n\}$.

For $0 \le k \le 1$, we see from (5.10) that (5.12) has no solutions on $\Omega'$, since
this would contradict (5.11). For $k = 0$, $T_k^1[u](P)$ is independent of $u$ and so
(5.12) has a unique solution. The Schauder-Leray theorem now shows that for
$k = 1$, (5.12) has a solution. Thus the integral equation (5.5) has a solution
$u(P)$ with maximum $M$, and this establishes the result, which we state as
follows.


THEOREM II. There exists a solution for suitable $t$ of
$\Delta u = -t F(P, u), \qquad F \ge \delta > 0,$

with assigned boundary values $f(p) < M$ and maximum $M$.


The proof shows that the minimum value of the solution is attained on the
boundary, and so is equal to $m_0$; however this could be deduced from the
differential equation given that $t$ is positive.

For our next theorem we insert the parameter $t$ with the dependent variable
$u$ in $F(P,u)$. This requires a different set of conditions to be satisfied by
$F(P, u)$, namely

(5.13) $F(P, 0) = 0$
and

(5.14) $F_u(P, u) \ge \delta > 0.$
Thus we consider the differential equation
(5.15) $\Delta u = -F(P, tu),$

and look for a solution with maximum $M$ and boundary values $f(p)$ where
(5.16) $0 < m_0 \le f(p) \le M_0 < M.$

The necessity of these restrictions will appear; meanwhile we remark that the
case $M_0 = M$ can be solved for $t = 0$ with a harmonic solution.
The appropriate integral equation is now

(5.17) $u(P) = \int_D G(P, Q)\,F(Q, tu(Q))\,dV_Q + v_0(P).$

We shall supply the parameter $k$ in front of the integral, but this leads to a
minor difficulty which suggests the addition of a further term. We define

(5.18) $T_k^2[u](P) = k \int_D G(P, Q)\,F(Q, tu(Q))\,dV_Q + C(1 - k)\,t + v_0(P),$

where
$2C = \delta m_0 G_0,$

and $G_0$ is again the maximum value of the integral (5.8). For $0 \le k \le 1$ the
right side of (5.18) is an increasing function of $t$, and we can choose $t = t_k^2[u]$
so that

(5.19) $\max T_k^2[u] = M.$

Since the first two terms in $T_k^2$ have the sign of $t$, and since $v_0(P) < M$, it is
evident that $t_k^2[u]$ must be positive. Thus for $0 \le k \le 1$, $T_k^2[u]$ will have the
lower bound $m_0$, since $m_0 \le v_0(P)$. We therefore define the region $\Omega$ of function
space $C$ as

(5.20) $\Omega: 0 < \tfrac12 m_0 < u(P) < K,$


where $K$ is a large positive constant as yet not fixed, but which exceeds $M$.

To show that $T_k^2$ is completely continuous in $\Omega + \Omega'$ we need a uniform
bound for $t_k^2[u]$, $u \in \Omega + \Omega'$. To find this, we take the point $P_0$ where (5.8)
has maximal value $G_0 > 0$, and note that for $u \in \Omega$, $F(P, u) > \tfrac12 \delta m_0$. Then

$M \ge T_k^2[u] \ge \tfrac12 k t \delta m_0 G_0 + C(1 - k)\,t + m_0$
$= \tfrac12 \delta m_0 G_0\,t + m_0$

according to the definition of $C$ in (5.18). Thus for $u \in \Omega$, we have
(5.21) $0 < t_k^2[u] \le \dfrac{2(M - m_0)}{\delta m_0 G_0}.$

The conclusion now follows quickly from the Leray-Schauder theorem.
The equation

(5.22) $u(P) = T_k^2[u](P)$
has no solutions on $\Omega'$ for $0 \le k \le 1$, since
$\tfrac12 m_0 < m_0 \le T_k^2[u] \le M < K.$

For $k = 0$, the operator $T_k^2$ is independent of $u$, so that a unique solution
exists. Thus for $k = 1$ the conclusion follows that (5.22) has a solution. From
(5.18) we see that (5.17) is then satisfied.


THEOREM III. Let $F(P, u)$ satisfy (5.13) and (5.14). Then there exists for a
suitable value of $t$ a solution of

$\Delta u = -F(P, tu)$

with assigned maximum $M > 0$ in $D + B$ and given boundary values $f(p) < M$
on $B$.


We note that $F(P,u) = \lambda_1 u$, where $\lambda_1$ is the lowest eigenvalue as in §4,
yields a counterexample to the solvability of the conventional Dirichlet
problem for this equation, since an orthogonality condition is necessary.

We conclude this section with a similar theorem for the equation

(5.23) $\Delta u = -F(P, u) - t\rho(P).$

Again the solution is to have a given maximum $M$ and boundary values
$f(p) < M$. The detailed assumptions are as follows. We take for $F(P, u)$ the
restrictions

(5.24) $F(P, u) \ge -F_0$

and

(5.25) $F_u(P, u) > 0,$

while the coefficient of $t$ on the right in (5.23) must satisfy
(5.26) $\rho(P) \ge \rho_0 > 0.$

The integral equation of the problem is

(5.27) $u(P) = \int_D G(P, Q)\,[F(Q, u(Q)) + t\rho(Q)]\,dV_Q + v_0(P),$

and so, defining

(5.28) $R(P) = \int_D G(P, Q)\,\rho(Q)\,dV_Q > 0,$

we set

(5.29) $T_k^3[u](P) = k \int_D G(P, Q)\,F(Q, u(Q))\,dV_Q + t R(P) + v_0(P).$










The choice of $t = t_k^3[u]$ is again governed by

(5.30) $\max T_k^3[u] = M.$
For the domain $\Omega$ we take

(5.31) $\Omega: -K < u < M + \varepsilon,$

where $K$ is a large positive constant. Now for $u \in \Omega$ we have from (5.24) and
(5.25) a limitation for $F(P, u)$:

(5.32) $|F(P, u)| \le A.$

Since $F(P, u)$ is bounded as $u \to -\infty$, $A$ is independent of $K$.

We now obtain bounds for $t = t_k^3[u]$. Since $v_0(P) < M$, the first two terms
together in (5.29) must be somewhere positive. Since $G(P, Q)$ is a non-negative
kernel, this implies that the integrand

$k F(Q, u(Q)) + t\rho(Q)$
is somewhere positive. Hence at some point $Q_1$,
$t\rho(Q_1) > -k F(Q_1, u(Q_1)) \ge -kA$

and so
$t > -kA/\rho_0.$

This furnishes a lower bound for $t$. An upper bound may be found if we note
that at the point $P_1$ where $R(P_1) = R_1$ is maximal, we have

$t R_1 \le M - k \int_D G\,F\,dV - v_0$
$\le M + k G_0 F_0 - m_0.$

Thus
(5.33) $-A/\rho_0 < t \le (M + G_0 F_0 - m_0)/R_1,$

and these bounds are independent of $K$.
The necessary lower bound for $T_k^3[u]$ is obtained by taking lower bounds
for each term. Thus

(5.34) $T_k^3[u] \ge -F_0 G_0 - \dfrac{A}{\rho_0} R_1 + m_0,$

where $m_0$ is a lower bound for $f(p)$. This lower bound (5.34) is independent
of $K$ and so if we choose

$K = 2\left(F_0 G_0 + \dfrac{A R_1}{\rho_0} + |m_0|\right),$
then the equation
(5.35) $u = T_k^3[u]$

will have no solutions on $\Omega'$ for $0 \le k \le 1$. For $k = 0$ there is a unique solution
as before. For $k = 1$, there must accordingly exist a solution and from (5.29)
we see that (5.27) is satisfied for a certain value of $t$. The maximum condition
(5.30) also holds and the solution of the problem is thus completed.


THEOREM IV. Let $F(P,u)$ satisfy (5.24) and (5.25), and let $\rho(P)$ satisfy
(5.26). Then there exists for a suitable value of $t$ a solution of

$\Delta u = -F(P, u) - t\rho(P),$

with assigned maximum $M$ in $D + B$ and given boundary values $f(p) < M$
on $B$.


The various conditions imposed on $F(P, u)$ in these theorems can be slightly
relaxed in various ways. However it is to be noted that the conditions of
Theorem III exclude all functions $F(P,u)$ satisfying the restrictions of the
other theorems.


6. A modified Neumann problem. As an illustration of the way in
which this method of proving existence theorems can be applied to other types
of boundary condition, we include here a modified Neumann problem for the
equation

(6.1) $\Delta u - \delta u = -F(P, u), \qquad \delta > 0,$
where
(6.2) $F(P, u) \ge -F_0, \qquad F_u(P, u) > 0.$

The boundary condition shall be

(6.3) $\dfrac{\partial u}{\partial n} = g_0(p) + t\,g_1(p),$

for some value of $t$. We take $g_0(p)$ and $g_1(p)$ to be $C^1$ with
(6.4) $g_1(p) > 0.$

The usual maximum condition $\max u = M$ shall hold.
The Neumann function $N(P, Q)$ of the linear equation

(6.5) $\Delta u - \delta u = 0$
may be written as
(6.6) $N(P, Q) = G(P, Q) + K(P, Q),$

where $G(P,Q)$ is the Green's function, and $K(P,Q)$ the Bergman kernel
function, of (6.5) (2). We shall need the complete continuity in the space $C$
of the operator with kernel $N(P, Q)$; this will be established by showing that
the operators based on $G(P,Q)$ and $K(P,Q)$ are completely continuous.
Indeed the proof for $G(P,Q)$ is the same as in §3. Now let us write down
Green's first formula on $D$ with argument functions $K(P,Q)$ and 1. Since
$K(P, Q)$ is a solution of the differential equation, we get








$\int_D (\nabla K \cdot \nabla 1 + \delta K \cdot 1)\,dV = \int_B \dfrac{\partial K}{\partial n}\,dS.$

The right hand expression is the solution of (6.5) with boundary values 1, and
so is less than or equal to 1 in $D$. Thus we find

$\int_D K(P, Q)\,dV \le \delta^{-1};$

this integral is uniformly bounded in $D + B$. We also note that $K(P, Q)$ is
non-negative (2) in $D + B$. A calculation of the kind given in §3 now leads to
the complete continuity of the operator based on $K(P, Q)$. Further details are
here omitted.

The integral equation of the problem is

(6.7) $u(P) = \int_D N(P, Q)\,F(Q, u(Q))\,dV_Q + t\,v_1(P) + v_0(P),$
where for $i = 0, 1$ we have
(6.8) $v_i(P) = \int_B N(P, q)\,g_i(q)\,dS_q.$

Since $N(P, Q) > 0$ it follows from (6.4) that $v_1(P) > 0$, and we denote by $v_1$
and $V_1$ positive lower and upper bounds:
$0 < v_1 \le v_1(P) \le V_1,$

while similarly choosing bounds for $v_0(P)$:

$v_0 \le v_0(P) \le V_0.$

The operator $T$ for this problem will now be defined as
(6.9) $T_k[u](P) = \int_D N(P, Q)\,F(Q, ku(Q))\,dV_Q + t\,v_1(P) + v_0(P),$
while $t = t_k[u]$ is fixed by the condition
(6.10) $\max T_k[u] = M.$
Setting 
$\Omega = \{u \mid -K < u < M + \varepsilon\},$

we find that for $u \in \Omega$, $F(P, u)$ satisfies an estimate

(6.11) $F(P, u) \le A.$

Then $t = t_k[u]$ has the bounds

(6.12) |NoA +) “= M| <i< M— nt NoFo ‘ 
0 0 


We therefore choose —K less than »~'|N,»A + V; — M|, which is possible 
since this quantity is independent of K. The equation 


$u = T_k[u]$

now has no solutions on $\Omega'$ for $0 \le k \le 1$, and a unique solution for $k = 0$.
The result now follows as before.


THEOREM V. There exists a solution of (6.1) which satisfies the boundary
condition (6.3) for some $t$, and has maximum value $M$.


As in Theorem I, the right side of (6.3) could be replaced by a more general 
increasing function of ¢. Corresponding results for the Dirichlet and Robin 
boundary conditions and this differential equation can be established along the 
same lines of proof. 


In conclusion we note that the uniqueness of solutions in all of these results 
has not been established. 


REFERENCES 


1. S. Banach, Théorie des opérations linéaires (Warsaw, 1932). 
2. S. Bergman and M. Schiffer, Kernel functions in the theory of partial differential equations of 
elliptic type, Duke Math. J., 15 (1948), 535-566. 

3. S. Bergman and M. Schiffer, Existence theorems for some quasilinear partial differential
equations, ONR report NRO 43068, (1952).

4. R. Courant and D. Hilbert, Methoden der mathematischen Physik (Berlin, Vol. 1, 1931; 
Vol. II, 1937). 

5. J. Leray and J. Schauder, Topologie et équations fonctionnelles, Ann. Ec. Norm. Sup., 61 
(1934), 46-78. 

6. L. Lichtenstein, Vorlesungen über nichtlineare Integralgleichungen (Berlin, 1931).





University of Toronto 








SOME ALGEBRAIC PROPERTIES OF ASYMPTOTIC
POWER SERIES


T. E. HULL 


1. Introduction. Let us consider all power series of the form
$c_0 + c_1 z + c_2 z^2 + \dots + c_n z^n + \dots.$


It was shown first by Borel (1) that to each such series there corresponds a 
non-empty class of functions such that each function in the class has the given 
series as its asymptotic expansion about z = 0, the expansion being valid in a 
sector of the right half z-plane with vertex at the origin. Various generaliza- 
tions of Borel’s theorem have been given by Carleman (1), van der Corput 
(2), and Erdélyi (3). 

We shall be interested only in the case where the c, are real and where z 
is a real, non-negative variable x. We are then led to the following special case 
of Borel’s theorem. To any series 


$c_0 + c_1 x + c_2 x^2 + \dots + c_n x^n + \dots,$
there corresponds at least one function $f(x)$ such that

$R_n(x)\,x^n = o(x^{n-1}), \qquad x \to 0,$
where

$R_n(x)\,x^n = f(x) - c_0 - c_1 x - c_2 x^2 - \dots - c_{n-1} x^{n-1}, \qquad n = 1, 2, 3, \dots,$

is the remainder after $n$ terms.

Because of Borel’s theorem the expressions ‘“‘asymptotic power series” and 
“formal power series’’ are equivalent; we shall refer to them as “asymptotic 
series’’ or simply as “‘series.’’ We shall refer to the class of all sum functions 
f(x) corresponding to a particular series as the asymptotic sum of the series. 

It is obvious that the collection of all asymptotic series forms a ring under
formal addition, subtraction, and multiplication and it is known (4) that this
ring is isomorphic to the ring of all asymptotic sums.

It is the purpose of this paper to discuss, using primarily algebraic notions, 
some of the properties of these rings. To do so we pay particular attention to 
the fundamental role played by those special asymptotic series for which 

(i) $c_0 \ge 0$,

(ii) there exists a sum function $f(x)$ such that, for all $x > 0$, $|R_n(x)| \le |c_n|$
($n = 0, 1, 2, \dots$), where $R_0(x) = f(x)$ and otherwise $R_n(x)$ is defined as above.

Condition (ii) means that the remainder, with respect to $f(x)$, is numerically
less than the first neglected term. Any series satisfying the properties (i) and
(ii) will be referred to as an S-series. Such series often arise in physical
problems and because of their remainder property are especially useful in
computations.

Received June 15, 1955. This research was supported by the United States Air Force,
through the Office of Scientific Research of the Air Research and Development Command.

Our plan is to show first that the collection of all S-series is closed under 
formal addition and multiplication (but not subtraction). Since the distribu- 
tive law will hold too, we shall call such a collection a semiring. Then we shall 
show that the full ring of all asymptotic series is generated from this semiring 
when we adjoin all differences to the semiring. 

We may mention that such an imbedding of a semiring in a ring can arise 
in other contexts. The simplest of these is the imbedding of the semiring of all 
integers greater than or equal to some fixed non-negative number in the ring 
of all integers. 


2. The S-series form a semiring. We proceed now to prove the first of 
our two theorems. 


THEOREM 1. The S-series form a semiring. That is, the formal sum or product 
of two S-series is an S-series and the distributive law holds. 


We show first that the coefficients in an S-series must alternate in sign
unless the series consists of only the constant term. Suppose that $c_n > 0$
($n = 0, 1, 2, \dots$). Then, since

$R_n(x) = c_n + R_{n+1}(x)\,x$

and

$|R_n(x)| \le |c_n|,$
we obtain

$R_{n+1}(x)\,x \le 0,$
so that

$R_{n+1}(x) \le 0.$
Moreover,

$R_{n+1}(x) = c_{n+1} + R_{n+2}(x)\,x$
and, letting $x \to 0$, we obtain
$R_{n+1}(0+) = c_{n+1},$
so that
$c_{n+1} \le 0.$

Similarly, if $c_n < 0$, we obtain $c_{n+1} \ge 0$.

We have still to show that the coefficients must all be non-zero except in
the special case where the series consists of only the constant term. We
obviously cannot have any coefficient equal to zero unless the series terminates;
but the series cannot terminate with the term $c_n x^n$ ($n = 1, 2, 3, \dots$), because
if it did we would have

$R_{n-1}(x) = c_{n-1} + c_n x,$

and $x$ could always be chosen so large that
$|R_{n-1}(x)| > |c_{n-1}|.$


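From now on we shall denote the non-constant S-series by, for example,

$f^\alpha = \alpha_0 - \alpha_1 x + \alpha_2 x^2 - \dots + (-1)^{n-1} \alpha_{n-1} x^{n-1} + (-1)^n R_n^\alpha x^n$
and

$f^\beta = \beta_0 - \beta_1 x + \beta_2 x^2 - \dots + (-1)^{n-1} \beta_{n-1} x^{n-1} + (-1)^n R_n^\beta x^n,$
where $\alpha_n, \beta_n > 0$ and $R_n^\alpha, R_n^\beta \ge 0$.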


The Theorem requires us to show that the sum or product of two S-series
is an S-series. The "sum" part of the proof is trivial. The "product" part is
also trivial if one or both of the series is constant; for the other case we note
that the remainder, with respect to $f^\alpha f^\beta$, after $n$ terms in the formal product of
the above two series can be written

$(-1)^n (\alpha_0 R_n^\beta + \alpha_1 R_{n-1}^\beta + \dots + \alpha_{n-1} R_1^\beta + R_n^\alpha R_0^\beta)\,x^n$
while the $(n + 1)$th term in the formal product is
$(-1)^n (\alpha_0 \beta_n + \alpha_1 \beta_{n-1} + \dots + \alpha_n \beta_0)\,x^n.$

The first of these two expressions is numerically less than or equal to the
second so that condition (ii) is satisfied. The first term in the formal product is
$\alpha_0 \beta_0 > 0$ and so condition (i) is also satisfied. The product series is therefore
an S-series with respect to the function $f^\alpha f^\beta$. It is obvious that the distributive
law holds and so the Theorem is proven.
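
The product part of the argument can be checked numerically on a concrete pair of S-series. The sketch below is an illustration only: the choice of the two functions, $1/(1+x)$ with series $1 - x + x^2 - \dots$ and $e^{-x}$ with series $\sum (-x)^n/n!$, and the helper names are assumptions made here, not taken from the text. It compares the size of the remainder of the product after $n$ terms with the coefficient of the first neglected term.

```python
import math


def product_coefficient(n: int) -> float:
    """|coefficient of x**n| in the formal product: sum over i + j = n of 1 * (1/j!)."""
    return sum(1.0 / math.factorial(j) for j in range(n + 1))


def product_remainder(x: float, n: int) -> float:
    """|R_n(x)| for f(x) = exp(-x) / (1 + x), i.e. |f - first n terms| / x**n."""
    partial = sum((-1) ** k * product_coefficient(k) * x ** k for k in range(n))
    return abs(math.exp(-x) / (1.0 + x) - partial) / x ** n


def remainder_bound_holds(xs, max_n: int, tol: float = 1e-9) -> bool:
    """Check |R_n(x)| <= |c_n| (condition (ii)) on a grid of sample points."""
    return all(
        product_remainder(x, n) <= product_coefficient(n) + tol
        for x in xs
        for n in range(max_n + 1)
    )


ok = remainder_bound_holds([0.25, 1.0, 3.0, 10.0], max_n=8)
```

A finite grid of sample points cannot prove the inequality, of course; the check only illustrates the statement of the Theorem for one pair of series.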

In fact we have shown that the semiring of all S-series is isomorphic to the 
semiring whose elements are the classes of sum functions which satisfy condi- 
tion (ii). The semiring possesses a unit and a zero element which are simply 
the numbers 1 and 0 respectively.

It can also be shown that the formal substitution of an S-series in place of 
the variable in a convergent series produces another S-series, provided the 
coefficients of the convergent series are positive and its radius of convergence 
is greater than the constant term in the first S-series. 

Incidentally, the non-constant S-series alone form a semiring without, of 
course, either a unit or a zero element. This semiring is an ideal, if differences 
are not allowed, in the larger semiring of all S-series. 


3. The semiring generates the ring. We shall now show that the full 
ring of all asymptotic series is generated from the semiring of all S-series 
when we adjoin all differences to the semiring. The result can be formulated 
in the following way. 


THEOREM 2. Any asymptotic series can be written as the difference between two
S-series.








Suppose that $\alpha_0, \alpha_1 > 0$ and consider the series
$\alpha_0 - \alpha_1 x + \alpha_2 x^2 - \dots + (-1)^n \alpha_n x^n + \dots.$

We shall show shortly that, if $\alpha_{n+1}/\alpha_n \to \infty$ as $n \to \infty$ and the inequalities
$\alpha_{n+1}/\alpha_n \ge \alpha_n/\alpha_{n-1}, \qquad n = 1, 2, 3, \dots,$

are also satisfied, the series is an S-series. The proof of the Theorem is then
straightforward; for, given any series

$c_0 + c_1 x + c_2 x^2 + \dots + c_n x^n + \dots,$

one can always choose some $\alpha_0, \alpha_1, \beta_0, \beta_1 > 0$ so that $\alpha_0 - \beta_0 = c_0$ and
$\alpha_1 - \beta_1 = -c_1$. Then one can always choose pairs $\alpha_2, \beta_2 > 0$, $\alpha_3, \beta_3 > 0, \dots$
in turn so that $\alpha_{n+1}/\alpha_n$ and $\beta_{n+1}/\beta_n \to \infty$ as $n \to \infty$, and

$\dfrac{\alpha_{n+1}}{\alpha_n} \ge \dfrac{\alpha_n}{\alpha_{n-1}}, \qquad \dfrac{\beta_{n+1}}{\beta_n} \ge \dfrac{\beta_n}{\beta_{n-1}}, \qquad n = 1, 2, 3, \dots,$

and so that $\alpha_n - \beta_n = (-1)^n c_n$. The $\alpha$-series and the $\beta$-series so formed are
then both S-series and their difference is the given series. The Theorem is then
proven.

We have only to show that the conditions assumed for a, in the above 
paragraph ensure that the corresponding a-series is an S-series. For a series 
to be an S-series, conditions (i) and (ii) must be satisfied. Condition (i) is 
satisfied since we have assumed that a» > 0; in fact our assumptions guarantee 
that all a, > 0. We can show that condition (ii) is also satisfied by constructing 
the required ‘‘sum”’ function. 

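We define the intervals $I_n$ in the following way: $I_0$ is the interval $0 \le x < \infty$
and $I_n$ ($n = 1, 2, 3, \dots$) is the interval $0 \le x < \alpha_{n-1}/\alpha_n$. Putting

$\varphi_n(x) = 1$ for $x \in I_n$, $\qquad \varphi_n(x) = 0$ for $x \notin I_n$,

we define

$f(x) = \sum_{n=0}^{\infty} (-1)^n \varphi_n(x)\,\alpha_n x^n.$

This series converges for all $x$; in fact, it terminates for each $x > 0$. Therefore
$f(x)$ is defined.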

For our purposes the essential points are the following. For each x, the 
terms which appear in the series for f(x) decrease in magnitude with increasing 
subscript (unless only the first term appears). The terms which do not appear 
in the series for f(x) are non-decreasing in magnitude with increasing subscript. 

Then, if we suppose that $x \in I_N - I_{N+1}$ ($N = 0, 1, 2, \dots$) and if we take
account of the fact that all terms are alternating in sign, we can easily pick
out one term which dominates $R_{n+1}(x)\,x^{n+1}$ ($n = -1, 0, 1, \dots$). We obtain

$|R_{n+1}(x)\,x^{n+1}| \le \alpha_{n+1} x^{n+1}$ for $n < N$,
$|R_{n+1}(x)\,x^{n+1}| = 0$ for $n = N$,
$|R_{n+1}(x)\,x^{n+1}| \le \alpha_n x^n$ for $n > N$.

The last expression is, in turn, $\le \alpha_{n+1} x^{n+1}$ when $n > N$, so that condition
(ii) is satisfied. (If $N = \infty$ we of course need to consider only the case where
$n < N$, and if $N = 0$ we need to consider only the cases where $n \ge N$.)
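
The construction of the sum function can be tried out directly. The sketch below is an illustration under stated assumptions: the coefficient sequence $\alpha_n = 2^{n(n-1)/2}$ (whose ratios $\alpha_{n+1}/\alpha_n = 2^n$ are non-decreasing and unbounded) and all function names are choices made here, not taken from the paper. It builds $f(x)$ by keeping exactly the terms whose index $n$ satisfies $x \in I_n$, and tests condition (ii) in exact rational arithmetic.

```python
from fractions import Fraction


def alpha(n: int) -> int:
    """Sample coefficients alpha_n = 2**(n(n-1)/2); ratios alpha_{n+1}/alpha_n = 2**n."""
    return 2 ** (n * (n - 1) // 2)


def in_interval(x: Fraction, n: int) -> bool:
    """True when x lies in I_n: I_0 = [0, oo), I_n = [0, alpha_{n-1}/alpha_n)."""
    return n == 0 or x * alpha(n) < alpha(n - 1)


def f(x: Fraction) -> Fraction:
    """Sum function: alternating terms alpha_n x**n restricted to indices with x in I_n."""
    if x <= 0:
        raise ValueError("this sketch only handles x > 0")
    total, n = Fraction(0), 0
    while in_interval(x, n):          # intervals are nested, so the loop stops
        total += (-1) ** n * alpha(n) * x ** n
        n += 1
    return total


def satisfies_condition_ii(x: Fraction, max_n: int) -> bool:
    """Check |f(x) - (first n terms)| <= alpha_n x**n for n = 0, ..., max_n."""
    fx, partial = f(x), Fraction(0)
    for n in range(max_n + 1):
        if abs(fx - partial) > alpha(n) * x ** n:
            return False
        partial += (-1) ** n * alpha(n) * x ** n
    return True


all_ok = all(satisfies_condition_ii(Fraction(p, 16), 10) for p in range(1, 65))
```

Exact fractions are used so that the boundary cases, where the remainder equals the first neglected term, are not disturbed by rounding.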

The conditions assumed for $\alpha_n$ are not necessary for the series to be an
S-series; this can be seen by considering the expansion of $e^{-x}$. Moreover it
can be shown that the S-series which do satisfy these conditions are closed
under addition but not under multiplication.

Incidentally we have in fact shown that the semiring of non-constant 
S-series also generates the full ring with the adjoining of all differences. 


4. Concluding remarks. D.C. Murdoch has pointed out that the above 
results enable one to define a partial ordering on the ring of all power series. 
One series a(x) can be defined to be “‘greater than or equal to’’ another series 
b(x) if and only if their difference is an S-series. By using a procedure analogous 
to that used in the first part of Theorem 2, one can then always construct 
an upper bound and a lower bound to any pair of series. However, it is also 
possible to show that neither the least upper bound nor the greatest lower 
bound required for a lattice can exist. 

Algebraic and other properties of asymptotic series have been considered 
by Popken (5). He discusses the ring of all asymptotically finite functions 
(and so does not restrict his attention to power series) and he shows, for 
example, that this ring is complete with respect to a certain non-Archimedean 
pseudo-valuation. 

We wish to thank B. N. Moyls for many interesting discussions during the 
preparation of this paper, and also the referee, particularly for pointing out an 
error in the proof of Theorem 2. 


REFERENCES 


1. T. Carleman, Les fonctions quasianalytiques (Paris, 1926). 

2. J. G. van der Corput, Asymptotic expansions I. Fundamental theorems of asymptotics (Univ. 
of California, Berkeley, 1954). 

3. A. Erdélyi, Asymptotic expansions (Cal. Tech., Pasadena, 1955). 

4. H. Poincaré, Sur les intégrales irrégulières des équations linéaires, Acta Math., 8 (1886)
295-344.

5. J. Popken, Asymptotic expansions from an algebraic standpoint, Nederl. Akad. Wetensch. 
Proc., Ser. A., 56 (Indagationes Math., 16 (1953), 131-143). 


University of British Columbia 













ASYMPTOTIC EXPANSIONS 


LEO MOSER AND MAX WYMAN 


1. Introduction. Let $a_1, a_2, \dots, a_m$ be a set of real non-negative numbers
and let
1.1 $P(x) = a_1 x + a_2 x^2 + \dots + a_m x^m \qquad (a_m \ne 0).$

Many combinatorial problems can be reduced to the study of numbers $B_n$
generated by

1.2 $\sum_{n=0}^{\infty} B_n x^n/n! = e^{P(x)}.$
n=0 

Some problems of this type were treated by Touchard (7), Jacobsthal (3), 
Chowla, Herstein, Moore and Scott (1; 2), and the present authors (4). 
In (2), the problem of finding asymptotic formulae for $B_n$ in terms of $P(x)$
was proposed. Essentially the same problem was solved earlier by Pólya (5),
as a by-product of an investigation of the zeros of the derivatives of certain
functions. The object of this paper is to give a different and more explicit
solution to this problem. Furthermore our method yields complete asymptotic
expansions, while that of Pólya gave only the first term.
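
For small $n$ the numbers $B_n$ can be computed exactly, which is useful for checking any asymptotic formula. The sketch below assumes that the generating function in (1.2) is $e^{P(x)}$; the function name `b_numbers` and the recurrence, obtained by differentiating $E(x) = e^{P(x)}$ to get $E' = P'E$ and comparing coefficients, are supplied here for illustration and are not taken from the paper.

```python
from fractions import Fraction
from math import factorial


def b_numbers(a, count: int):
    """Return [B_0, ..., B_{count-1}] for P(x) = a[0] x + a[1] x**2 + ...

    With e_n = B_n / n!, the identity E' = P' E gives
        n * e_n = sum_{k=1}^{min(n, m)} k * a_k * e_{n-k}.
    """
    coeffs = [Fraction(c) for c in a]
    e = [Fraction(1)]                       # e_0 = exp(P(0)) = 1
    for n in range(1, count):
        s = sum(k * coeffs[k - 1] * e[n - k]
                for k in range(1, min(n, len(coeffs)) + 1))
        e.append(s / n)
    return [e_n * factorial(n) for n, e_n in enumerate(e)]


# Hypothetical example: P(x) = x + x**2/2, whose B_n count the involutions
# of n objects.
involutions = b_numbers([1, Fraction(1, 2)], 7)
```

Exact rational arithmetic keeps every $B_n$ free of rounding error, at the cost of speed for large $n$.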


2. Preliminary notions. Since some of the coefficients $a_k$ in (1.1) may be
zero, $P(x)$ will in general have the form

2.1 $P(x) = b_1 x^r + b_2 x^s + \dots + a_m x^m,$

where the coefficients $b_1, b_2, \dots, a_m$ are positive. In what follows we shall
assume

2.2 $(r, s, \dots, m) = 1.$


This involves no essential loss of generality since one can always reduce the 
problem to this case by a substitution of the form y = x*. 


LEMMA 1. If $0 \le \theta \le \pi$ and $\cos r\theta = 1$, $\cos s\theta = 1, \dots, \cos m\theta = 1$ then
$\theta = 0$.


Proof. If $0 < \theta \le \pi$ then $\cos r\theta = 1$ implies the existence of positive
integers $a$ and $b$, $(a, b) = 1$, $b > 1$, such that $\theta = \pi a/b$. Now $r\theta = r\pi a/b$,
$s\theta = s\pi a/b, \dots, m\theta = m\pi a/b$ are each integral multiples of $2\pi$. Hence $b$
divides $r, s, \dots, m$, which contradicts (2.2).

Corresponding to $P(x)$ as defined in (1.1) we define a trigonometric
polynomial $S(R, \theta)$ by


Received July 28, 1955. 








2.3 $S(R, \theta) = \tfrac12\,[P(Re^{i\theta}) + P(Re^{-i\theta})] = \sum_{k=1}^{m} a_k R^k \cos k\theta.$

Further we define $\varepsilon$ by

2.4 e = Rii-™/s, 

and prove 


LEMMA 2. For $\varepsilon \le \theta \le \pi$ and $R$ sufficiently large, $S(R, \theta) \le S(R, \varepsilon)$.


Proof. Since $a_k \ge 0$ for $k = 1, 2, \dots, m$, $S(R, \theta)$ assumes its greatest value
at $\theta = 0$. Also by (1.1), (2.3) and (2.4) we have


2.5 S(R,0) — S(R, €) = > aR*(1 — cos ke) = oR’). 


On the other hand, there exists by (2.2) and Lemma 1, a positive integer
$t \le m$ such that $\cos(t\theta) \ne 1$ and $a_t \ne 0$. Hence for fixed $\theta$,

2.6 $S(R, 0) - S(R, \theta) = \sum_{k=1}^{m} a_k R^k (1 - \cos k\theta) > C_1 R^t,$

where $C_1$ is a fixed positive constant. Comparing (2.5) and (2.6) gives the
required result.


3. Asymptotic formulae. By (1.2) and Cauchy's theorem

3.1 $B_n = \frac{n!}{2\pi i} \int_c \frac{e^{P(z)}}{z^{n+1}}\, dz,$

where $c$ denotes the circle $z = Re^{i\theta}$. We note that $R$, the radius of the circle, is
arbitrary. From (3.1) we obtain


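3.2 $B_n = A \int_{-\pi}^{\pi} e^{F(R,\theta)}\, d\theta,$

where

3.3 $A = n!\, e^{P(R)} / 2\pi R^n,$

and

3.4 $F(R, \theta) = P(Re^{i\theta}) - P(R) - i n \theta.$
Let $J$ be defined by

3.5 $J = \int_{\varepsilon}^{\pi} e^{F(R,\theta)}\, d\theta,$

where $\varepsilon$ is given by (2.4).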
Lemma 3. |J| = O(exp(—R’)). 
Proof. Clearly

$|J| \le \int_{\varepsilon}^{\pi} e^{S(R,\theta) - S(R,0)}\, d\theta,$

and the required result follows from (2.5) and Lemma 2.








ASYMPTOTIC EXPANSIONS 227 


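Since we will show that the integral in (3.2) can be expanded in powers of
$1/R$, we may neglect integrals of type (3.5) and write

3.6 $B_n \sim A \int_{-\varepsilon}^{\varepsilon} e^{F(R,\theta)}\, d\theta.$

Our next step is to expand $F(R, \theta)$ in a Maclaurin series of the form

3.7 $F(R, \theta) = \sum_{j=1}^{\infty} C_j(R)\, (i\theta)^j / j!,$

where

3.8 $C_1(R) = \sum_{k=1}^{m} k a_k R^k - n,$

and

3.9 $C_j(R) = \sum_{k=1}^{m} k^j a_k R^k \quad (j > 1).$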


At this stage we choose $R$ so that

3.10 $C_1(R) = 0.$

For large $n$, (3.10) will have a unique solution which may be calculated by
iteration starting with

3.11 $R \sim (n / m a_m)^{1/m}.$
When (3.10) holds, (3.7) can be written in the form
3.12 $F(R, \theta) = -\tfrac{1}{2} C_2(R)\, \theta^2 + \sum_{j=3}^{\infty} C_j(R)\, (i\theta)^j / j!.$


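In order to simplify some of the expressions which occur, we introduce the
following notations:

3.13 $R = z^{-2}, \quad z = R^{-1/2},$
3.14 $\bar{C}_j(z) = z^{2m} C_j(z^{-2}) = \sum_{k=1}^{m} k^j a_k z^{2(m-k)} \quad (j > 1),$
3.15 $f_j(z) = z^{m(j-2)}\, \bar{C}_j(z)\, \{2/\bar{C}_2(z)\}^{j/2},$
3.16 $\lambda = \varepsilon\, (C_2(R)/2)^{1/2},$
3.17 $\phi = \theta\, (C_2(R)/2)^{1/2},$
3.18 $H = A\, (2/C_2(R))^{1/2},$
3.19 $\psi(z, \phi) = \sum_{j=3}^{\infty} f_j(z)\, (i\phi)^j / j!.$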


If we now make the substitution (3.17) in (3.6) and use (3.12) and (3.19)
we obtain

3.20 $B_n \sim H \int_{-\lambda}^{\lambda} e^{-\phi^2 + \psi(z,\phi)}\, d\phi.$










From (2.4), (3.9) and (3.16) we see that for R large, 
3.21 K,R'* > } > K;R"", 
where $K_1$ and $K_2$ are fixed positive constants. Further, for $R$ sufficiently
large it is not difficult to show that there exists an interval $-c < z < c$
for which $\psi(z, \phi)$ and $e^{\psi(z,\phi)}$ have Maclaurin expansions in $z$ of the form

3.22 $\psi(z, \phi) = \sum_{k=1}^{\infty} \psi_k(\phi)\, z^k$

and

3.23 $e^{\psi(z,\phi)} = \sum_{k=0}^{\infty} \Psi_k(\phi)\, z^k, \quad \Psi_0(\phi) = 1,$

which are uniformly convergent for $|\phi| \le \lambda$. It is further easy to justify the
fact that the $\psi_k(\phi)$ are given by means of (3.19) to be

3.24 $\psi_k(\phi) = \sum_{j=3}^{\infty} \frac{1}{k!} \left[ \frac{d^k f_j(z)}{dz^k} \right]_{z=0} \frac{(i\phi)^j}{j!}.$

From (3.19) we see that $\psi_k(\phi)$ are polynomials in $\phi$ and hence $\Psi_k(\phi)$ are also
polynomials in $\phi$. In fact $\Psi_{2k}(\phi)$ contains only even powers of $\phi$ while $\Psi_{2k+1}(\phi)$
only contains odd powers of $\phi$.
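Using (3.24) in (3.20) we have

3.25 $B_n \sim H \left[ \sum_{k=0}^{s-1} \left( \int_{-\lambda}^{\lambda} \Psi_k(\phi)\, e^{-\phi^2} d\phi \right) z^k + R_s \right],$

where

3.26 $R_s = \int_{-\lambda}^{\lambda} e^{-\phi^2} \sum_{k=s}^{\infty} \Psi_k(\phi)\, z^k\, d\phi.$

Using (3.21) and the fact that the $\Psi_k(\phi)$ are polynomials in $\phi$, (3.25) yields

3.27 $B_n \sim H \left[ \sum_{k=0}^{s-1} \left( \int_{-\infty}^{\infty} \Psi_k(\phi)\, e^{-\phi^2} d\phi \right) z^k + R_s \right],$

where $R_s$ is still given by (3.26). In order to complete our proof it remains
to show that for fixed $s$, $R_s = O(z^s)$. If this is so our complete asymptotic
formula becomes

3.28 $B_n \sim H \left[ \sum_{k=0}^{\infty} \int_{-\infty}^{\infty} \Psi_k(\phi)\, e^{-\phi^2} d\phi \,\Big/\, R^{k/2} \right].$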


Finally, in view of the remarks following (3.24), the integrals in (3.28) will
vanish for odd $k$ and (3.28) can be put in the form

3.29 $B_n \sim H \left[ \sum_{k=0}^{\infty} \left( \int_{-\infty}^{\infty} \Psi_{2k}(\phi)\, e^{-\phi^2} d\phi \right) \Big/ R^{k} \right].$

We shall now consider $\psi(z, \phi)$ as given by (3.19) to be a function of a complex
variable $z$ and a real variable $\phi$. We restrict $z$ to be in a neighborhood
$|z| \le \sigma < 1$ and $\phi$ to be bounded. Under these restrictions (3.14) yields

3.30 $|\tfrac{1}{2} \bar{C}_2(z)| \ge \tfrac{1}{2} \left| m^2 a_m - \sum_{k=1}^{m-1} k^2 a_k \sigma^{2(m-k)} \right|.$

Clearly, by taking $\sigma$ small enough we may say that

3.31 $|\tfrac{1}{2} \bar{C}_2(z)| \ge \alpha^2,$

where $\alpha$ is a positive constant.
Similarly from (3.14) and the fact that $|z| < 1$ we can say that

3.32 $|\bar{C}_j(z)| \le \sum_{k=1}^{m} k^j |a_k| \le a\, m^{j+1},$

where $a = \max(|a_1|, |a_2|, \ldots, |a_m|)$. Hence from (3.15), (3.31) and (3.32)
we obtain

3.33 $|f_j(z)| \le a\, m^{j+1} / \alpha^j,$

where $a$, $m$, and $\alpha$ are independent of $j$ and $z$. From (3.33) and Cauchy's theorem
on derivatives we obtain

3.34 $\left| \frac{d^k f_j(z)}{dz^k} \right|_{z=0} \le a\, m^{j+1}\, k! / \alpha^j \sigma^k.$

By introducing the notation $am = M$, $m/\alpha = K$, $\sigma = 1/S$, (3.34) may be
written

3.35 $\left| \frac{d^k f_j(z)}{dz^k} \right|_{z=0} \le M K^j\, k!\, S^k.$

From (3.15) the derivatives in (3.35) vanish for $m(j - 2) > k$. Since $m$ is a
positive integer, $k + 2 \ge (k/m) + 2$. Hence (3.24) can be written

3.36 $\psi_k(\phi) = \sum_{j=3}^{k+2} \frac{1}{k!} \left[ \frac{d^k f_j(z)}{dz^k} \right]_{z=0} \frac{(i\phi)^j}{j!}.$
From (3.35) we now obtain
3.37 $|\psi_k(\phi)| \le M S^k \sum_{j=3}^{k+2} (K|\phi|)^j / j!,$
and by induction on $k$ we easily deduce
3.38 $|\psi_k(\phi)| \le M S^k [K|\phi|]^3 [1 + K|\phi|]^k.$
By a lemma proved in (4), (3.38) implies
3.39 $|\Psi_k(\phi)| \le M [K|\phi|]^3 [1 + M(K|\phi|)^3]^{k-1} S^k (1 + K|\phi|)^k.$
From (3.39) we obtain
3.40 $\left| \sum_{k=s}^{\infty} \Psi_k(\phi)\, z^k \right| \le \sum_{k=s}^{\infty} M [K|\phi|]^3 (1 + M(K|\phi|)^3)^{k-1} S^k (1 + K|\phi|)^k |z|^k = M [K|\phi|]^3 (1 + M(K|\phi|)^3)^{s-1} S^s (1 + K|\phi|)^s |z|^s / T,$

where $T$ is given by

3.41 $T = 1 - [1 + M(K|\phi|)^3][1 + K|\phi|]\, S|z|.$










We now revert to real values of z. Recalling that z = R-! and |¢| < \ we 
have, from (3.21), that \ < K,R"’’, 


3.42 \o|*|2| = O(R-’*). 


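Hence for $R$ sufficiently large we have $T > \tfrac{1}{2}$. Thus (3.40) yields

3.43 $\left| \sum_{k=s}^{\infty} \Psi_k(\phi)\, z^k \right| \le Q_s(|\phi|)\, z^s,$

where $Q_s(|\phi|)$ is a polynomial in $|\phi|$. From (3.26) we now obtain
3.44 $|R_s| \le \int_{-\lambda}^{\lambda} e^{-\phi^2} Q_s(|\phi|)\, d\phi \; z^s \le z^s \int_{-\infty}^{\infty} e^{-\phi^2} Q_s(|\phi|)\, d\phi.$
Since $Q_s(|\phi|)$ is a polynomial, the last integral of (3.44) exists and hence
3.45 $R_s = O(z^s).$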


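This completes the proof of the main result (3.29) which can be written
in the form

3.46 $B_n \sim \frac{n!\, e^{P(R)}}{2\pi R^n} \left[ \frac{2}{C_2(R)} \right]^{1/2} \sum_{k=0}^{\infty} \left( \int_{-\infty}^{\infty} \Psi_{2k}(\phi)\, e^{-\phi^2} d\phi \right) \Big/ R^k,$
where $R$ is determined by
3.47 $\sum_{k=1}^{m} k a_k R^k - n = 0.$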


In concluding this section we might point out that $\Psi_0(\phi) = 1$. Hence the
first term of the asymptotic expansion is easily calculated. If we introduce the
operator $\Theta$ by

3.48 $\Theta = R\, \frac{d}{dR},$

then the first term of the expansion is given by

3.49 $B_n \sim \frac{n!\, e^{P(R)}}{R^n\, [2\pi\, \Theta^2 P(R)]^{1/2}},$

and $R$ as a function of $n$ is given by

3.50 $\Theta P(R) = n.$
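
As a hedged numerical illustration of the first-term formula, the following Python sketch computes the exact $B_n$ from the generating function $e^{P(x)}$ and compares them with $n!\, e^{P(R)} / (R^n [2\pi C_2(R)]^{1/2})$, where $R$ is the positive root of $\sum k a_k R^k = n$. The function names `exact_B`, `saddle_radius` and `first_term` are illustrative choices and not part of the paper, and bisection is only one convenient way to solve for $R$.

```python
from fractions import Fraction
from math import exp, lgamma, log, pi


def exact_B(coeffs, n_max):
    """Exact B_0..B_n_max for exp(P(x)) = sum B_n x^n / n!, with coeffs[k-1] = a_k."""
    b = [Fraction(1)]  # b[n] = B_n / n!
    for n in range(1, n_max + 1):
        # Differentiating exp(P(x)) gives n * b_n = sum_k k * a_k * b_{n-k}.
        s = sum(k * Fraction(a) * b[n - k]
                for k, a in enumerate(coeffs, start=1) if k <= n)
        b.append(s / n)
    out, fact = [], 1
    for n, bn in enumerate(b):
        fact *= max(n, 1)  # running value of n!
        out.append(bn * fact)
    return out


def saddle_radius(coeffs, n):
    """Positive root R of sum_k k * a_k * R^k = n, located by bisection."""
    def g(R):
        return sum(k * float(a) * R ** k for k, a in enumerate(coeffs, start=1))
    lo, hi = 0.0, 1.0
    while g(hi) < n:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def first_term(coeffs, n):
    """Leading term n! e^{P(R)} / (R^n sqrt(2 pi C_2(R))), evaluated through logarithms."""
    R = saddle_radius(coeffs, n)
    P = sum(float(a) * R ** k for k, a in enumerate(coeffs, start=1))
    C2 = sum(k * k * float(a) * R ** k for k, a in enumerate(coeffs, start=1))
    return exp(lgamma(n + 1) + P - n * log(R) - 0.5 * log(2 * pi * C2))
```

For instance, `exact_B([1, Fraction(1, 2)], 10)[10]` counts the solutions of $x^2 = 1$ in the symmetric group of degree 10, and `first_term([1, Fraction(1, 2)], 10)` gives the corresponding leading approximation.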





4. Applications. To illustrate applications of the method we consider 
three special cases. 


Example 1. P(x) = x. 
In this case the numbers $B_n$ are all 1. However, since the asymptotic formula
obtained by our method involves the factor $n!$, it will lead in this case to
Stirling's expansion for $n!$. Equations (3.8), (3.9) and (3.10) yield $R = n$ and
$C_j(R) = R = n$ $(j > 1)$. Applying (3.46) we obtain

4.1 $1 \sim \frac{n!\, e^n}{n^n (2\pi n)^{1/2}} \left( 1 - \frac{1}{12n} + \ldots \right)$
or
4.2 $n! \sim \left( \frac{n}{e} \right)^n (2\pi n)^{1/2} \left( 1 + \frac{1}{12n} + \ldots \right)$
as required.


Example 2. $P(x) = x + (x^p/p)$.
In this case it is known (3) that if $p$ is a prime then $B_n = B_{n,p}$ is the number
of solutions of $x^p = 1$ in the symmetric group of degree $n$. The case $p = 2$
was treated in (1) and (4) and the result for $p > 2$ was announced in (4).
In this case we have

4.3 $P(R) = R + (R^p/p),$

4.4 $C_1(R) = R + R^p - n = 0$
and

4.5 $C_2(R) = R + pR^p.$

From these and (3.49) the first term of the asymptotic expansion is given by

4.6 $B_{n,p} \sim \frac{n!\, \exp(R + R^p/p)}{R^n \{2\pi (R + pR^p)\}^{1/2}}.$

Now using (4.2) and (4.4) we obtain,

4.7 $n! \sim n^n e^{-n} (2\pi n)^{1/2},$
4.8 $\exp\left( \frac{R^p}{p} \right) = \exp\left( \frac{n - R}{p} \right).$
Also
4.9 $R^n = (n - R)^{n/p} = n^{n/p} \exp\left( \frac{n}{p} \log\left( 1 - \frac{R}{n} \right) \right).$
Expanding $\log(1 - R/n)$ yields
4.10 $R^n \sim n^{n/p} \exp\left( -\frac{R}{p} - \frac{R^2}{2pn} \right).$
Finally
4.11 $(R + pR^p)^{1/2} \sim (pn)^{1/2}.$
Using (4.7) to (4.11) in (4.6) yields
4.12 $B_{n,p} \sim \left( \frac{n}{e} \right)^{n(1 - 1/p)} p^{-1/2} \exp\left( R + \frac{R^2}{2pn} \right).$


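We now consider two cases:
Case 1. $p = 2$. Here $e^{R + (R^2/2pn)} \sim e^{n^{1/2} - 1/2 + 1/4} = \exp(n^{1/2} - \tfrac{1}{4})$.
Case 2. $p > 2$. Here $e^{R + (R^2/2pn)} \sim \exp(n^{1/p})$.
Thus we obtain

4.13 $B_{n,2} \sim \left( \frac{n}{e} \right)^{n/2} \exp(n^{1/2})\, 2^{-1/2} e^{-1/4}$
and
4.14 $B_{n,p} \sim \left( \frac{n}{e} \right)^{n(1 - 1/p)} p^{-1/2} \exp(n^{1/p}) \quad (p > 2).$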
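
The explicit formula for $p = 2$ can be checked numerically. The sketch below is an illustration only, with function names of our own choosing: it generates the exact numbers $B_{n,2}$ from the recurrence $B_n = B_{n-1} + (n-1) B_{n-2}$, which follows from differentiating $e^{x + x^2/2}$, and evaluates the right-hand side $(n/e)^{n/2} \exp(n^{1/2})\, 2^{-1/2} e^{-1/4}$. Agreement is only asymptotic, so the ratio approaches 1 slowly.

```python
from math import exp, log, sqrt


def involutions(n_max):
    """Exact B_{n,2} for n = 0..n_max via B_n = B_{n-1} + (n-1) * B_{n-2}."""
    B = [1, 1]
    for n in range(2, n_max + 1):
        B.append(B[n - 1] + (n - 1) * B[n - 2])
    return B[: n_max + 1]


def approx_p2(n):
    """(n/e)^{n/2} * exp(sqrt(n)) / (sqrt(2) * e^{1/4}), computed through logarithms."""
    return exp(0.5 * n * (log(n) - 1.0) + sqrt(n) - 0.25) / sqrt(2.0)


if __name__ == "__main__":
    exact = involutions(100)
    for n in (10, 50, 100):
        print(n, exact[n] / approx_p2(n))
```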


Example 3. $P(x) = 2tx + x^2$ $(t > 0)$.


Here $B_n = B_n(t)$ are polynomials in $t$. In this case we have

4.15 $P(R) = 2tR + R^2,$
4.16 $\Theta P(R) = 2tR + 2R^2 = n,$
4.17 $R = \tfrac{1}{2}[-t + (2n + t^2)^{1/2}].$
From these and (3.49) we obtain
4.18 $B_n(t) \sim \frac{n!\, \exp(2Rt + R^2)}{R^n \{2\pi (2Rt + 4R^2)\}^{1/2}},$

where $R$ is given by (4.17).
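
A short numerical sketch of this example follows. It assumes the recurrence $B_n(t) = 2t B_{n-1}(t) + 2(n-1) B_{n-2}(t)$, obtained by differentiating $e^{2tx + x^2}$, and compares it with the first-term formula using the closed form for $R$. The function names are illustrative only.

```python
from math import exp, lgamma, log, pi, sqrt


def hermite_type_B(n, t):
    """B_n(t) for exp(2tx + x^2), from B_n = 2t B_{n-1} + 2(n-1) B_{n-2}."""
    prev, cur = 1.0, 2.0 * t  # B_0 and B_1
    if n == 0:
        return prev
    for m in range(2, n + 1):
        prev, cur = cur, 2.0 * t * cur + 2.0 * (m - 1) * prev
    return cur


def first_term_example3(n, t):
    """n! exp(2Rt + R^2) / (R^n sqrt(2 pi (2Rt + 4R^2))) with R from the closed form."""
    R = 0.5 * (-t + sqrt(2.0 * n + t * t))
    log_value = (lgamma(n + 1) + 2.0 * R * t + R * R
                 - n * log(R)
                 - 0.5 * log(2.0 * pi * (2.0 * R * t + 4.0 * R * R)))
    return exp(log_value)
```

Evaluating `hermite_type_B(40, 1.0) / first_term_example3(40, 1.0)` gives a ratio close to 1, as expected from a leading-term approximation.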


By computing the first two terms of the asymptotic expansion, $B_n(t)$ can be
put in the form

4.19 $B_n(t) \sim \left( \frac{2n}{e} \right)^{n/2} 2^{-1/2} \exp\left( (2n)^{1/2} t - \tfrac{1}{2} t^2 \right) \left\{ 1 + \frac{t^3 + 3t}{6 (2n)^{1/2}} + \ldots \right\}.$

Our method restricts $t$ to be positive. However, the above result is valid also
for $t < 0$. $B_n(t)$ is of course related to the Hermite polynomials and (4.19) can
be checked by means of the known expansion formula for these polynomials
given in (6, p. 194).


5. Conclusion. We have given here a method of finding asymptotic
expansions for numbers or functions whose generating function is of the form
$e^{P(x)}$. In this paper we have restricted $P(x)$ to be a polynomial in $x$ with
non-negative coefficients. If this severe restriction on $P(x)$ is relaxed, (3.49) may no
longer be valid. We hope, in a subsequent paper, to show how the method may
be modified to cope with the case of less restricted functions $P(x)$.
















REFERENCES 


1. S. Chowla, I. N. Herstein, and K. Moore, On recursions connected with symmetric groups I,
Can. J. Math., 3 (1951), 328-334.
2. S. Chowla, I. N. Herstein, and W. R. Scott, The solutions of $x^d = 1$ in symmetric groups,
Norske Vid. Selsk., 25 (1952), 29-31.
3. E. Jacobsthal, Sur le nombre d'éléments du groupe symétrique $S_n$ dont l'ordre est un nombre
premier, Norske Vid. Selsk., 21 (1949), 49-51.
4. L. Moser and M. Wyman, On solutions of $x^d = 1$ in symmetric groups, Can. J. Math., 7
(1955), 159-168.
5. G. Pólya, Ueber die Nullstellen sukzessiver Derivierten, Math. Z., 12 (1922), 36-60.
6. G. Szegő, Orthogonal Polynomials, Amer. Math. Soc. Coll. Publications (New York, 1939).
7. J. Touchard, Sur les cycles des substitutions, Acta Math., 70 (1939), 242-297.


University of Alberta 








THE ASYMPTOTIC SERIES FOR A CERTAIN CLASS 
OF PERMUTATION PROBLEMS 


N. S. MENDELSOHN 


1. Introduction. This paper is concerned with problems connected with
the permutations of the integers $1, 2, \ldots, n$ subject to certain special restrictions.
One such class of problems, the so-called "card matching" problems,
deals with conditions of the type, "the number $i$ is in the $j$th position," "the
number $k$ is in the $m$th position," etc. The given conditions need not be
compatible, i.e. a meaningful problem results from having amongst the set of
conditions such conditions as "1 is second," "2 is second," "2 is third." In a
permutation of $1, 2, 3, \ldots, n$ and a set of conditions $S$ we will say that there
are $r$ "hits" if exactly $r$ of the given conditions are fulfilled. Amongst the $n!$
permutations of the numbers $1, 2, 3, \ldots, n$, suppose there are $N(r)$ in which
there are $r$ hits. The problem of determining $N(r)$ has been treated in (3; 6;
7). These results may be expressed in the language of probability by saying that
$M(r) = N(r)/n!$ is the probability of exactly $r$ hits.

A second type of problem deals with the so-called relative conditions,
i.e., conditions such as "$i$ immediately precedes $j$." These problems are dealt
with in much the same way as the previous type in (3), and for the purposes
of this paper will not require a separate treatment.

For a fairly large class of problems of both types the distribution of $M(r)$
is asymptotic to a Poisson distribution. In fact, in these cases, it is possible to
write $M(r)$ in the form:

(1) $M(r) = \frac{e^{-A} A^r}{r!} \left( 1 + \frac{c_1}{n} + \frac{c_2}{n(n-1)} + \frac{c_3}{n(n-1)(n-2)} + \ldots \right).$

In general the $c_i$ are polynomials in $r$ of degree at most $2i$. It is the purpose of
this paper to show how to compute $A, c_1, c_2, \ldots$. The determination of $A$,
$c_1, c_2, \ldots$, does not require a knowledge of the exact expression for $N(r)$.
It suffices to have a difference equation for a certain polynomial operator
associated with the given set of conditions. The computation can be carried
out completely from a knowledge of the coefficients which appear in the
difference equation, together with the initial conditions necessary to fix the
solution of the difference equation.


2. The general problem. In what follows, the discussion will be confined
to the card-matching type of problem although one of the illustrations given
in the end will deal with a "relative condition" problem. If $p_{i,j}$ denotes the


Received July 22, 1955. 


ASYMPTOTIC SERIES FOR PERMUTATION PROBLEMS 235 


condition "$i$ is in position $j$" and $N(0)$ and $N(r)$ denote the number of permutations
of $1, 2, \ldots, n$ in which there are 0 and $r$ hits respectively, the method
of inclusion and exclusion yields the following formulae for $N(0)$ and $N(r)$:

$N(0) = \sum_{k=0}^{n} (-1)^k \phi_{n,k}\, (n - k)!$

and

$N(r) = \sum_{k=r}^{n} (-1)^{k+r} \phi_{n,k} \binom{k}{r} (n - k)!.$

In these formulae $\phi_{n,k}$ represents the number of ways in which exactly $k$
compatible conditions may be chosen from the set of all $p_{i,j}$.

If $M(0)$ and $M(r)$ are the probabilities of 0 and $r$ hits respectively, the
relevant formulae are:

$M(0) = \sum_{k=0}^{n} (-1)^k \phi_{n,k} \frac{(n - k)!}{n!}$

and

$M(r) = \sum_{k=r}^{n} (-1)^{k+r} \phi_{n,k} \binom{k}{r} \frac{(n - k)!}{n!}.$

If $\psi(t)$ is the generating function of the number of hits, i.e.

$\psi(t) = \sum_{r} M(r)\, t^r,$

we have:

$\psi(t) = \sum_{r} \sum_{k=r}^{n} (-1)^{k+r} \phi_{n,k} \binom{k}{r} \frac{(n - k)!}{n!}\, t^r = \sum_{k=0}^{n} \phi_{n,k}\, (t - 1)^k\, \frac{(n - k)!}{n!}.$

The determination of $\phi_{n,k}$ has been treated in (3; 6; 7). Perhaps the most
interesting representation has been given by Kaplansky and Riordan in (6).
In this representation, for each condition $p_{i,j}$ the cell in the $i$th row, $j$th
column in an $(n \times n)$ chessboard is marked. It is easily seen that $\phi_{n,k}$ is
the number of ways of putting $k$ non-attacking rooks on the marked squares
of the board. This representation makes it easy in special cases to obtain
explicit formulae for $\phi_{n,k}$, and in more complicated cases it simplifies the
determination of recurrence relationships.

Fréchet (2) gives a thorough discussion of the method of inclusion and
exclusion on which the formulae of this section are based.
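
The inclusion and exclusion formula for $N(r)$ can be checked on small cases by brute force. The following Python sketch is illustrative only (the function names are our own): it counts non-attacking rook placements on the marked cells, evaluates $\sum_k (-1)^{k+r} \phi_{n,k} \binom{k}{r} (n-k)!$, and compares the result with a direct enumeration of all $n!$ permutations. A cell $(i, j)$ stands for the condition "$i$ is in position $j$".

```python
from itertools import combinations, permutations
from math import comb, factorial


def rook_numbers(cells, n):
    """phi[k]: number of ways to choose k compatible conditions (k non-attacking rooks)."""
    phi = []
    for k in range(n + 1):
        phi.append(sum(
            1 for chosen in combinations(cells, k)
            if len({i for i, _ in chosen}) == k and len({j for _, j in chosen}) == k
        ))
    return phi


def hits_by_enumeration(cells, n):
    """counts[r]: number of permutations of 1..n with exactly r hits."""
    marked = set(cells)
    counts = [0] * (n + 1)
    for perm in permutations(range(1, n + 1)):  # perm[j-1] is the number in position j
        r = sum((perm[j - 1], j) in marked for j in range(1, n + 1))
        counts[r] += 1
    return counts


def hits_by_formula(cells, n):
    """N(r) from the inclusion and exclusion formula, for r = 0..n."""
    phi = rook_numbers(cells, n)
    return [sum((-1) ** (k + r) * phi[k] * comb(k, r) * factorial(n - k)
                for k in range(r, n + 1))
            for r in range(n + 1)]
```

The conditions need not be compatible: the set `[(1, 2), (2, 2), (2, 3)]` with `n = 3` corresponds to "1 is second," "2 is second," "2 is third," and both functions agree on it.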


3. The symbolic representation. By the use of the difference operator
$E$, defined as $E f(m) = f(m + 1)$, the formulae for $N(0)$, $N(r)$, $M(0)$, $M(r)$ may
be expressed in the forms:








236 N. S. MENDELSOHN 


$N(0) = P_n(E)\, f(0),$
$N(r) = P_n(E)\, g_r(0),$
$M(0) = P_n(E)\, f^*(0),$
$M(r) = P_n(E)\, g_r^*(0),$
where
$P_n(E) = \sum_{k=0}^{n} (-1)^k \phi_{n,k} E^k,$
$f(t) = (n - t)!,$
$g_r(t) = (-1)^r \binom{t}{r} (n - t)!$ for $t \ge r$, and $g_r(t) = 0$ for $t < r$,
$f^*(t) = \frac{(n - t)!}{n!},$
$g_r^*(t) = (-1)^r \binom{t}{r} \frac{(n - t)!}{n!}$ for $t \ge r$, and $g_r^*(t) = 0$ for $t < r$.


In (3; 7) methods of obtaining difference equations for $P_n(E)$ are given and in a
number of cases these lead to explicit formulae. This paper is concerned mostly
with a determination of the asymptotic series for $M(0)$ and $M(r)$ in the cases
where the difference equation for $P_n(E)$ is of a special form. In a large class of
problems discussed in the literature the polynomial $P_n(E)$ does indeed have a
difference equation of the required form.


4. Some illustrative examples. A number of examples (mostly classical)
are given here to illustrate in concrete terms the type of problem with which
this discussion is concerned. These examples have also served to verify the
correctness of formulae which are developed later in this paper. Such verification
is necessary since the computations are quite formidable.


Example 1. Problème des rencontres. In this example the set of conditions
is: for $i = 1, 2, \ldots, n$, "$i$ is in position $i$." In this case the formulae become:

$M(0) = 1 - \frac{1}{1!} + \frac{1}{2!} - \ldots + (-1)^n \frac{1}{n!},$
$M(r) = \frac{1}{r!} \left( 1 - \frac{1}{1!} + \frac{1}{2!} - \ldots + (-1)^{n-r} \frac{1}{(n-r)!} \right),$
$P_n(E) = (1 - E)^n.$
$P_n(E)$ satisfies the recurrence formula
$P_n(E) = (1 - E)\, P_{n-1}(E).$

Asymptotic formulae are:
$M(0) \sim e^{-1}, \quad M(r) \sim e^{-1}/r!.$
In this case the general asymptotic series of the form

$M(r) = \frac{e^{-1}}{r!} \left( 1 + \frac{c_1}{n} + \frac{c_2}{n(n-1)} + \ldots \right)$

does not exist. The reason for this is that convergence to the asymptotic
value is so rapid that $c_1$, $c_2$, etc. are all 0.


Example 2. Problème des ménages. The conditions of this problem are:
for $i = 1, 2, \ldots, n - 1$, "$i$ is in $i$th position" and "$i$ is in $(i + 1)$th position,"
together with "$n$ is in $n$th position" and "$n$ is in first position." The requisite
formulae are:

$M(0) = \sum_{i=0}^{n} (-1)^i \frac{2n}{2n - i} \binom{2n - i}{i} \frac{(n - i)!}{n!},$

$M(r) = \sum_{i=r}^{n} (-1)^{i+r} \frac{2n}{2n - i} \binom{2n - i}{i} \binom{i}{r} \frac{(n - i)!}{n!};$

the recurrence formula for $P_n(E)$ is given by,
$P_n(E) = (1 - 2E)\, P_{n-1}(E) - E^2 P_{n-2}(E),$

and the asymptotic formulae are:

$M(0) = e^{-2} \left( 1 - \frac{1}{(n-1)} + \frac{1}{2!\,(n-1)_2} - \ldots + (-1)^k \frac{1}{k!\,(n-1)_k} + \ldots \right),$
$M(r) = \frac{e^{-2} 2^r}{r!} \left( 1 - \frac{(r - 1)(r - 4)}{4n} + \frac{r^4 - 14r^3 + 51r^2 - 38r - 16}{32 n(n - 1)} \right) + O(n^{-3}).$
Here $n_k$ is the Jordan factorial notation for $n(n - 1)(n - 2) \ldots (n - k + 1)$.


These results were obtained by Kaplansky and Riordan in (5) and check with
the formula developed here.
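
The agreement between the exact and asymptotic expressions for the ménages problem can be examined numerically. The sketch below is a hedged illustration with function names of our own: it evaluates $M(r)$ exactly, with rational arithmetic, from the rook numbers $\frac{2n}{2n-k}\binom{2n-k}{k}$, and compares it with the series $\frac{e^{-2} 2^r}{r!}\left(1 - \frac{(r-1)(r-4)}{4n} + \frac{r^4 - 14r^3 + 51r^2 - 38r - 16}{32n(n-1)}\right)$.

```python
from fractions import Fraction
from math import comb, exp, factorial


def menage_exact(n, r):
    """Exact M(r) for the probleme des menages, as a Fraction."""
    total = Fraction(0)
    for k in range(r, n + 1):
        phi = Fraction(2 * n, 2 * n - k) * comb(2 * n - k, k)  # rook number phi_{n,k}
        total += (-1) ** (k + r) * phi * comb(k, r) * factorial(n - k)
    return total / factorial(n)


def menage_series(n, r):
    """Asymptotic series for M(r) truncated after the 1/(n(n-1)) term."""
    c1 = -(r - 1) * (r - 4) / (4 * n)
    c2 = (r ** 4 - 14 * r ** 3 + 51 * r ** 2 - 38 * r - 16) / (32 * n * (n - 1))
    return exp(-2) * 2 ** r / factorial(r) * (1 + c1 + c2)


if __name__ == "__main__":
    for r in range(4):
        print(r, float(menage_exact(60, r)), menage_series(60, r))
```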


Example 3. Ménages non-circulaires. This differs from Example 2 only in
the omission of the condition "$n$ is in first position." Here the formulae
are:

$M(0) = \sum_{k=0}^{n} (-1)^k \binom{2n - k}{k} \frac{(n - k)!}{n!},$
$M(r) = \sum_{k=r}^{n} (-1)^{k+r} \binom{2n - k}{k} \binom{k}{r} \frac{(n - k)!}{n!};$

the recurrence formula for $P_n(E)$ is,
$P_n(E) = (1 - 2E)\, P_{n-1}(E) - E^2 P_{n-2}(E),$

which is identical with that of the ordinary ménages problem; the asymptotic
formulae are

$M(0) = e^{-2} \left( 1 - \frac{1}{2n(n - 1)} \right) + O(n^{-3}),$

$M(r) = \frac{e^{-2} 2^r}{r!} \left( 1 - \frac{r(r - 3)}{4n} + \frac{r^4 - 10r^3 + 23r^2 + 2r - 16}{32 n(n - 1)} \right) + O(n^{-3}).$

These results were also given in (5).





Example 4. Rook-king problem. This example is a case of a relative condition
problem. The conditions are "1 immediately precedes 2," "$n$ immediately
precedes $n - 1$" and for $i = 2, 3, 4, \ldots, n - 1$, "$i$ immediately precedes
$(i - 1)$" and "$i$ immediately precedes $(i + 1)$." This problem had been
treated in (3) and (4). No convenient exact expressions for $M(0)$ and $M(r)$
have been found. The recurrence formula for $P_n(E)$ is given by

$P_n(E) = (1 - E)\, P_{n-1}(E) - E\, P_{n-2}(E).$
The formulae developed later yield the following asymptotic series:

$M(0) = e^{-2} \left( 1 - \frac{2}{n^2} \right) + O(n^{-3}),$

and

$M(r) = \frac{e^{-2} 2^r}{r!} \left( 1 + \frac{r(3 - r)}{2n} + \frac{r^4 - 8r^3 + 9r^2 + 22r - 16}{8n(n - 1)} \right) + O(n^{-3}).$
Example 5. The final example is one which has not appeared anywhere in
the literature. The set of conditions is: "1 is 2nd," "$n$ is $(n - 1)$th" and for
$i = 2, 3, 4, \ldots, n - 1$, "$i$ is $(i - 1)$th" and "$i$ is $(i + 1)$th." In terms of the
chessboard representation the marked squares are precisely those on the two
diagonals adjacent to the main diagonal. The recurrence formula for $P_n(E)$
has been obtained by the present author by the method given in his paper (7).
Exact formulae for $M(0)$ and $M(r)$ are not readily obtained but $P_n(E)$ satisfies
the recurrence formula
$P_n(E) = (1 - E)\, P_{n-1}(E) + (-E + E^2)\, P_{n-2}(E) + E^3 P_{n-3}(E).$

This together with

$P_2(E) = 1 - 2E + E^2, \quad P_3(E) = 1 - 4E + 4E^2,$

$P_4(E) = 1 - 6E + 11E^2 - 6E^3 + E^4$
are sufficient to define $P_n(E)$ for all $n \ge 2$. The first three terms of $P_n(E)$ are
readily computed to be

$P_n(E) = 1 - (2n - 2) E + (2n^2 - 7n + 7) E^2 + \ldots.$
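
The recurrence of this example can be confirmed for small $n$ by counting rook placements directly. The Python sketch below is only an illustration, and its helper names are assumptions of ours: it builds the coefficient list of $P_n(E)$ for the board whose marked squares are the two diagonals adjacent to the main diagonal, and evaluates $(1 - E) P_{n-1} + (-E + E^2) P_{n-2} + E^3 P_{n-3}$ with plain polynomial arithmetic.

```python
from itertools import combinations


def rook_poly(n):
    """Coefficients of P_n(E) in increasing powers of E, signs (-1)^k included."""
    cells = [(i, i + 1) for i in range(1, n)] + [(i + 1, i) for i in range(1, n)]
    coeffs = []
    for k in range(n + 1):
        count = sum(
            1 for c in combinations(cells, k)
            if len({i for i, _ in c}) == k and len({j for _, j in c}) == k
        )
        coeffs.append((-1) ** k * count)
    return coeffs


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def recurrence_rhs(p1, p2, p3):
    """(1 - E) * p1 + (-E + E^2) * p2 + E^3 * p3 for p1, p2, p3 = P_{n-1}, P_{n-2}, P_{n-3}."""
    terms = [poly_mul([1, -1], p1), poly_mul([0, -1, 1], p2), poly_mul([0, 0, 0, 1], p3)]
    size = max(len(t) for t in terms)
    return [sum(t[d] for t in terms if d < len(t)) for d in range(size)]
```

For example, `recurrence_rhs(rook_poly(4), rook_poly(3), rook_poly(2))` reproduces `rook_poly(5)`, whose second and third coefficients agree with $-(2n - 2)$ and $2n^2 - 7n + 7$ at $n = 5$.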

The formula to be developed yields the asymptotic expressions: 


M(0) = (1 + + sn) + O(n~*) , 




and 


—2o5r 2 a ‘ 3 2 
Mtr) =! x, -& Bonk) We tet 18) 5 O65, 








r! 4n 32n(n — 1) 


A peculiarity arises here. There is a "pseudo recurrence formula"
$P_n(E) = (1 - 2E)\, P_{n-1}(E) - E^2 P_{n-2}(E),$

which is not satisfied by the $P_n(E)$ associated with this example. Nevertheless,
this "pseudo recurrence formula" yields the correct asymptotic series. The
reason for this is that if correct values of $P_{n-1}(E)$ and $P_{n-2}(E)$ are substituted
in the "pseudo recurrence formula" the formula yields the correct polynomial
$P_n(E)$ except for the term in $E^n$. Asymptotically, this term is of no importance.
The author has constructed several examples of problems which can be
associated with "pseudo recurrence formulae" which are simpler than the true
recurrence formulae and which yield the proper asymptotic series. However,
no general theory of this phenomenon has as yet been formulated.


5. The general theory. From this point on, only permutation problems
whose polynomial operators satisfy a difference equation of the type

(2) $P_n = (\alpha_1 - \beta_1 E) P_{n-1} + (\alpha_2 - \beta_2 E + \gamma_2 E^2) P_{n-2} + (\alpha_3 - \beta_3 E + \gamma_3 E^2 - \delta_3 E^3) P_{n-3} + \ldots + (\alpha_k - \beta_k E + \gamma_k E^2 + \ldots + (-1)^k \lambda_k E^k) P_{n-k},$

where $P_n = P_n(E)$, $k$ is a fixed integer and all the Greek letters are constants,
will be considered. It does not seem possible to give a precise characterization
of the problems whose operators satisfy such a recurrence formula but some
relevant observations may be made here. The total number of conditions
possible is $n^2$. If in a specific problem the number of conditions in the set $S$
is of the form $an^2 + bn + c$ with $a \ne 0$, no recurrence formula of the above
type is possible. It is also true that there is no asymptotic series of the type

$M(r) = \frac{e^{-A} A^r}{r!} \left( 1 + \frac{c_1}{n} + \frac{c_2}{n(n-1)} + \ldots \right).$

Examples of such problems have been given by Kaplansky and Riordan in
(6) but they are outside our scope. If the set $S$ has $kn + l$ conditions a recurrence
formula of the given type is possible and if these conditions form a
reasonably regular pattern of marked squares in the chessboard representation
the existence of a suitable recurrence is likely, and probably can be obtained in a
routine way, by the methods given in (6) or (7).

Assume now that a recurrence formula for $P_n(E)$ exists and is given by
equation (2). An asymptotic series of the type given in equation (1) is sought.
In this connection it is possible to show that the method given by Kaplansky
in (3) yields a result of the form

$M(r) = \frac{e^{-A} A^r}{r!} \left( 1 + \frac{c_1}{n} \right) + O(n^{-2})$

in those cases where $\alpha_i \ge 0$, $\beta_i \ge 0$ $(i = 1, 2, 3, \ldots, k)$. In what follows it
is assumed that the complete asymptotic series exists and the work is confined
to the computation of the $c_i$ under this assumption. The polynomial $P_n(E)$
is given by:

(3) $P_n(E) = 1 - \phi_{n,1} E + \phi_{n,2} E^2 + \ldots + (-1)^n \phi_{n,n} E^n.$


Using equations (2) and (3) it follows by complete induction that $\phi_{n,i}$ is a
polynomial of degree $i$ in $n$ with coefficients which are functions of $i$. It is
convenient to express $\phi_{n,i}$ in the form

(4) $\phi_{n,i} = C_0^{(i)} n_i + C_1^{(i)} (n - 1)_{i-1} + C_2^{(i)} (n - 2)_{i-2} + \ldots + C_i^{(i)}.$

The notation $n_i$ is the Jordan factorial notation defined previously. To avoid
complications of notation we define $\phi_{n,0} = 1$ and $\phi_{n,r} = 0$ if $r < 0$ or $r > n$.
On substituting the expression for $P_n(E)$ as given by (3) into the recurrence
(2) and arranging the result in powers of $E$ the following relations are obtained:

$1 = \alpha_1 + \alpha_2 + \ldots + \alpha_k$

(from the constant term), and

(5) $\phi_{n,r} = \sum_{i=1}^{k} \alpha_i \phi_{n-i,r} + \sum_{i=1}^{k} \beta_i \phi_{n-i,r-1} + \sum_{i=2}^{k} \gamma_i \phi_{n-i,r-2} + \ldots + \lambda_k \phi_{n-k,r-k}.$
tS = i= 


The expression (4) for $\phi_{n,i}$ is substituted into equation (5) to yield an expression
which will be referred to as equation (6). Because of its extreme length,
equation (6) is not written down here. On comparing coefficients of $n^r$ in
equation (6) and by the use of induction the following result is obtained:

(7) $C_0^{(r)} = \frac{A^r}{r!}, \quad A = \frac{\beta_1 + \beta_2 + \ldots + \beta_k}{\alpha_1 + 2\alpha_2 + \ldots + k\alpha_k}.$

To compute $C_1^{(r)}$ the following procedure is used. First, all the $\phi_{n,r}$ occurring
in equation (6) are expressed as factorial polynomials in terms of the variable
$n - k - 2$ by making use of the relationship


(n + t)y = My + tu Myr +t — ew = 2) My — ... 








Then the coefficients of (n — k — 2),-2 in both sides of the resultant equation 
are equated to yield 

A r—2 
(r — 1)!’ 





(8) Cc,” — mn gr — 


where 







k 

L= > (k-i+2) 6, 
t=—1 
k 


M = Drs 


t=? 


P=1¥ &-i+ tat - +0) 


The recurrence formula for $P_n(E)$ does not determine $C_1^{(1)}$ but this can be
obtained from a knowledge of $P_1(E)$. In terms of $C_1^{(1)}$, $B$ and $A$ it is easily
seen by induction that equation (8) has as solution

(9) $C_1^{(r)} = \frac{A^{r-1} C_1^{(1)}}{(r - 1)!} - \frac{A^{r-2} B}{(r - 2)!}.$

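At this point the first two terms of the asymptotic series will be computed.
From the relationship $M(r) = P_n(E)\, g_r^*(0)$, the result

$M(r) = \frac{1}{n!} \left\{ \phi_{n,r} (n - r)! - \phi_{n,r+1} \binom{r+1}{1} (n - r - 1)! + \phi_{n,r+2} \binom{r+2}{2} (n - r - 2)! - \ldots \right\}$

is obtained. This reduces (on substituting for $\phi_{n,r}$ the expression (4)) to:

$M(r) = \left\{ C_0^{(r)} - \binom{r+1}{1} C_0^{(r+1)} + \binom{r+2}{2} C_0^{(r+2)} - \ldots \right\} + \frac{1}{n} \left\{ C_1^{(r)} - \binom{r+1}{1} C_1^{(r+1)} + \binom{r+2}{2} C_1^{(r+2)} - \ldots \right\} + O(n^{-2}).$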


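Substituting for $C_0^{(r)}$ and $C_1^{(r)}$ the expressions obtained in (7) and (9) yields
the equation

$M(r) = \left\{ \frac{A^r}{r!} - \binom{r+1}{1} \frac{A^{r+1}}{(r+1)!} + \binom{r+2}{2} \frac{A^{r+2}}{(r+2)!} - \ldots \right\}$
$\quad + \frac{C_1^{(1)}}{n} \left\{ \frac{A^{r-1}}{(r-1)!} - \binom{r+1}{1} \frac{A^{r}}{r!} + \binom{r+2}{2} \frac{A^{r+1}}{(r+1)!} - \ldots \right\}$
$\quad - \frac{B}{n} \left\{ \frac{A^{r-2}}{(r-2)!} - \binom{r+1}{1} \frac{A^{r-1}}{(r-1)!} + \binom{r+2}{2} \frac{A^{r}}{r!} - \ldots \right\} + O(n^{-2}).$

This reduces to

$M(r) = \frac{e^{-A} A^r}{r!} \left\{ 1 + \frac{1}{n} \left[ C_1^{(1)} \left( \frac{r}{A} - 1 \right) - B \left( \frac{r(r-1)}{A^2} - \frac{2r}{A} + 1 \right) \right] \right\} + O(n^{-2}).$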


In the case where $r = 0$, the result further reduces to

$M(0) = e^{-A} \left( 1 - \frac{B + C_1^{(1)}}{n} \right) + O(n^{-2}).$


The author has computed two further terms of the asymptotic series.
The results are quite complicated in form. The term in $1/n(n - 1)$ has been
verified by means of the examples quoted previously. It is given here for
completeness but all the computations have been omitted as they are quite
involved and do not utilize any new idea. The final formula contains the
number $C_2^{(2)}$ which is not obtainable from the recurrence formula for $P_n(E)$.
All that is required for the computation of $C_2^{(2)}$ is a knowledge of $P_2(E)$.
The final result is


M(0) = oti -*B+c")+ en ~ TAs B\) 


1 (2) r(ry—1) 2 ) 
+a ( ica lhe 


+i - I(r — 2) 3r(r — 1) 7 1) 


+ O(n-*), 











A A A’ A 
a(r(r — 1) — 2) _ (ry — 1) (2r —1) , Br? Or $1 1)¥] 
+3 24° A — oe re 
+ O(n). 


All terms in M(r) except I have been previously defined. The value of T is 
given by the expression: 


Fae 5{ (RA + SA* + TA’ + UA) + C;°(KA* + LA? + MA) 
— B(KA’® — M)}, 


where 


z 
T= > vik —i1 +3), 
t—2 


z 
U= >d &. 
t—3 










The above formulae while formidable in appearance are quite simple to 
apply in practical cases. In none of the five examples quoted did the compu- 
tations require as much as five minutes. 


6. Distribution moments. The difference equation for $P_n(E)$ may be
used to yield all the moments of the distribution of $M(r)$ as well as the asymptotic
series. In this section formulae are established for the mean $m$ and the
variance $v$ of the distribution.

The computation of moments is most easily carried out by the use of the
notion of a factorial moment. The factorial moments are more natural to the
type of problem considered in this paper than are the more usual power
moments. The $i$th factorial moment of the distribution of the number of hits
is defined as $M^{(i)}$, where

$M^{(i)} = \sum_{r=0}^{n} r(r - 1)(r - 2) \ldots (r - i + 1)\, M(r).$
It has been shown in (4) and (3) that

$M^{(i)} = \phi_{n,i} \Big/ \binom{n}{i}.$

Actually, this result follows easily by a direct computation. In terms of these
factorial moments the mean $m$ and the variance $v$ are given by:

$m = M^{(1)}, \quad v = M^{(2)} + M^{(1)} - \{M^{(1)}\}^2.$
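
The relation between factorial moments and rook numbers, $M^{(i)} = \phi_{n,i} / \binom{n}{i}$, can be verified by direct enumeration on a small board. The following Python sketch is a hedged illustration using exact rational arithmetic; the function names are ours, and a cell $(i, j)$ again means "$i$ is in position $j$".

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb, factorial


def hit_distribution(cells, n):
    """M[r] as exact fractions, obtained by enumerating all n! permutations."""
    marked = set(cells)
    counts = [0] * (n + 1)
    for perm in permutations(range(1, n + 1)):
        counts[sum((perm[j - 1], j) in marked for j in range(1, n + 1))] += 1
    return [Fraction(c, factorial(n)) for c in counts]


def rook_number(cells, k):
    """Number of ways to place k non-attacking rooks on the marked cells."""
    return sum(
        1 for c in combinations(cells, k)
        if len({i for i, _ in c}) == k and len({j for _, j in c}) == k
    )


def factorial_moment(M, i):
    """Sum over r of r(r-1)...(r-i+1) * M[r]."""
    total = Fraction(0)
    for r, prob in enumerate(M):
        falling = 1
        for s in range(i):
            falling *= r - s
        total += falling * prob
    return total
```

With the ménages board for $n = 5$, `factorial_moment(M, 2)` equals `Fraction(rook_number(cells, 2), comb(5, 2))`, and the variance computed from the distribution equals $M^{(2)} + M^{(1)} - \{M^{(1)}\}^2$.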
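In terms of the constants computed in this paper these formulae become:

$m = \frac{\phi_{n,1}}{n} = A + \frac{C_1^{(1)}}{n},$

$v = M^{(2)} + M^{(1)} - \{M^{(1)}\}^2$
$\;\; = \frac{2\phi_{n,2}}{n(n-1)} + \left( A + \frac{C_1^{(1)}}{n} \right) - \left( A + \frac{C_1^{(1)}}{n} \right)^2$
$\;\; = 2C_0^{(2)} + \frac{2C_1^{(2)}}{n} + \frac{2C_2^{(2)}}{n(n-1)} + A + \frac{C_1^{(1)}}{n} - \left( A + \frac{C_1^{(1)}}{n} \right)^2$
$\;\; = A^2 + \frac{2}{n} (A C_1^{(1)} - B) + \frac{2C_2^{(2)}}{n(n-1)} + A + \frac{C_1^{(1)}}{n} - \left( A + \frac{C_1^{(1)}}{n} \right)^2$
$\;\; = A + \frac{1}{n} (C_1^{(1)} - 2B) - \frac{[C_1^{(1)}]^2}{n^2} + \frac{2C_2^{(2)}}{n(n-1)}.$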













REFERENCES 


1. T. S. Broderick, On some symbolic formulae in probability theory, Proc. Roy. Irish Acad., 44
(1937), 19-28.
2. M. Fréchet, Les probabilités associées à un système d'événements compatibles et dépendants,
Actualités Scientifiques et Industrielles, nos 859 et 942 (Paris, 1940 and 1943).
3. I. Kaplansky, Symbolic solution of certain problems in permutations, Bull. Amer. Math. Soc.,
50 (1944), 906-914.
4. I. Kaplansky, The asymptotic distribution of runs of consecutive elements, Ann. Math. Statist., 16
(1945), 200-203.
5. I. Kaplansky and J. Riordan, Le problème des ménages, Scripta Math., 12 (1946), 113-124.
6. I. Kaplansky and J. Riordan, The problem of the rooks and its applications, Duke Math. J., 16 (1946), 259-268.
7. N. S. Mendelsohn, Symbolic solution of card matching problems, Bull. Amer. Math. Soc.,
52 (1946), 918-924.
8. N. S. Mendelsohn, Applications of combinational formulae to generalizations of Wilson's theorem, Can.
J. Math., 1 (1949), 328-336.
9. J. Riordan, Three line Latin rectangles II, Amer. Math. Monthly, 53 (1946), 18-20.


University of Manitoba








MAXIMAL DETERMINANTS IN 
COMBINATORIAL INVESTIGATIONS 


H. J. RYSER 


1. Introduction. Let $Q$ be a matrix of order $v$, all of whose entries are 0's
and 1's. Let the total number of 1's in $Q$ be $t$, and let the absolute value of the
determinant of $Q$ be denoted by $|\det Q|$. In this paper we study the problem of
determining the maximum of $|\det Q|$ for fixed $t$ and $v$. It turns out that this
problem is closely related to the $v, k, \lambda$ problem, which has been extensively
studied of late.

A $v, k, \lambda$ configuration is defined as an arrangement of $v$ elements
$x_1, x_2, \ldots, x_v$ into $v$ sets $S_1, S_2, \ldots, S_v$ such that each set contains exactly $k$ distinct
elements and such that each pair of sets has exactly $\lambda$ elements in common
$(0 < \lambda < k < v)$. If element $x_j$ belongs to set $S_i$, let $a_{ij} = 1$; and if $x_j$ does not
belong to $S_i$, let $a_{ij} = 0$. The $v$ by $v$ matrix $A = [a_{ij}]$ is called the incidence
matrix of the $v, k, \lambda$ configuration. These matrices have been very useful in
establishing the nonexistence of certain configurations (1; 2). A general survey
of the literature pertaining to $v, k, \lambda$ configurations may be found in (4).
In particular one proves that in a $v, k, \lambda$ configuration,

$k - \lambda = k^2 - \lambda v$
and
$A A^T = A^T A = B.$

Here $A^T$ denotes the transpose of the incidence matrix $A$, and the matrix $B$
has $k$ in the main diagonal and $\lambda$ in all other positions. It is easy to see that
$\det B = k^2 (k - \lambda)^{v-1}$, whence it follows that

$|\det A| = k (k - \lambda)^{(v-1)/2}.$
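
As a small worked check of these identities, the Python sketch below builds the incidence matrix of the configuration with $v = 7$, $k = 3$, $\lambda = 1$ from the residues $\{0, 1, 3\}$ modulo 7 (a standard choice that is our assumption, not taken from the paper), and computes its determinant exactly. The expected absolute value is $k(k - \lambda)^{(v-1)/2} = 3 \cdot 2^3 = 24$. The function names are illustrative.

```python
from fractions import Fraction


def exact_det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= factor * m[c][j]
    return sign * result


def cyclic_incidence(v=7, base=(0, 1, 3)):
    """Row i marks the elements of the set {i + d mod v : d in base}."""
    return [[1 if (j - i) % v in base else 0 for j in range(v)] for i in range(v)]


def gram(a):
    """The product A * A^T."""
    n = len(a)
    return [[sum(a[i][s] * a[j][s] for s in range(n)) for j in range(n)] for i in range(n)]
```

Here `gram(cyclic_incidence())` has 3 on the main diagonal and 1 elsewhere, matching the matrix $B$ with $k = 3$ and $\lambda = 1$.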


2. Theorems on maximal determinants. 


THEOREM 1. Let $Q$ be a 0, 1 matrix of order $v$, containing exactly $t$ 1's. Let $k$
denote a positive real, and set $\lambda = k(k - 1)/(v - 1)$. If $t \le kv$ and $0 < \lambda \le k - \lambda$,
or if $t \ge kv$ and $0 < k - \lambda \le \lambda$, then

$|\det Q| \le k (k - \lambda)^{(v-1)/2}.$
Let $E$ be a 0, 1 matrix. Let $E(x, y)$ denote the matrix formed from $E$ by
replacing each 1 of $E$ by $x$ and each 0 of $E$ by $y$, where $x$ and $y$ are indeterminates.
Using this notation, we may write

$Q_1 = Q(-(k - \lambda)/\lambda,\ 1).$
Received May 31, 1955.








246 H. J. RYSER 





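Now set $p = (k - \lambda)/\lambda$, and define the matrix $\bar{Q}$ of order $v + 1$ by

(1) $\bar{Q} = \begin{bmatrix} p & z \\ z^T & Q_1 \end{bmatrix},$

where $z = (\sqrt{p}, \ldots, \sqrt{p})$. By the Hadamard determinant theorem,
(2) $|\det \bar{Q}| \le \sqrt{p^2 + vp}\; \prod_{i=1}^{v} \sqrt{p + s_i},$

where $s_i$ denotes the sum of the squares of the $i$th row of $Q_1$. Now

$p^2 + vp = \frac{k^2}{\lambda^2} (k - \lambda).$

Moreover,

$s_1 + \ldots + s_v = tp^2 + (v^2 - t) = t(p^2 - 1) + v^2.$

By hypothesis, $t \le kv$ and $p^2 \ge 1$, or $t \ge kv$ and $p^2 \le 1$. Hence we may
conclude

$s_1 + \ldots + s_v \le kv(p^2 - 1) + v^2.$

Now introduce quantities $\bar{s}_i$ such that

$\bar{s}_i \ge s_i$
and
(3) $\bar{s}_1 + \ldots + \bar{s}_v = v(kp^2 + v - k).$
By (3),

$\sum_{i=1}^{v} (p + \bar{s}_i) = v(kp^2 + v - k + p) = v[kp^2 + (\lambda v - \lambda k + k - \lambda)/\lambda] = vkp(p + 1) = v(k - \lambda)k^2/\lambda^2.$

Since the geometric mean of $v$ positive quantities is less than or equal to their
arithmetic mean, we may write

(4) $\prod_{i=1}^{v} (p + \bar{s}_i) \le \left( \frac{1}{v} \sum_{i=1}^{v} (p + \bar{s}_i) \right)^v,$
whence
(5) $\prod_{i=1}^{v} (p + \bar{s}_i) \le (k - \lambda)^v k^{2v} / \lambda^{2v}.$

Hence by (2),

(6) $|\det \bar{Q}| \le \frac{k}{\lambda} \sqrt{k - \lambda}\; \prod_{i=1}^{v} \sqrt{p + \bar{s}_i}.$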


COMBINATORIAL INVESTIGATIONS 247 


To evaluate $\det \bar{Q}$, multiply row one by $-1/\sqrt{p}$ and add the resulting row to
each of the other rows. From (6) it follows that

(7) $|\det \bar{Q}| = p\, |\det Q(-k/\lambda,\ 0)| \le (k \sqrt{k - \lambda}/\lambda)^{v+1}.$
But
$|\det Q(-k/\lambda,\ 0)| = (k/\lambda)^v\, |\det Q|$, whence
$p\, |\det Q| \le \frac{k}{\lambda} (\sqrt{k - \lambda})^{v+1},$

and

$|\det Q| \le k (\sqrt{k - \lambda})^{v-1}.$

Using the notation of Theorem 1, we have

THEOREM 2. If $|\det Q| = k(k - \lambda)^{(v-1)/2}$, then $Q$ is the incidence matrix of a
$v, k, \lambda$ configuration.


If equality holds in Theorem 1, then

$|\det Q(-k/\lambda,\ 0)| = \frac{\lambda}{k - \lambda} \left( \frac{k \sqrt{k - \lambda}}{\lambda} \right)^{v+1},$
and by (7),

(8) $|\det \bar{Q}| = (k \sqrt{k - \lambda}/\lambda)^{v+1}.$

Equality in (6) implies equality in (5) and (4). But for equality to hold in (4),
we must have

$p + \bar{s}_i = (k - \lambda) k^2 / \lambda^2.$
But then the equality in (6) implies
(9) $\bar{Q} \bar{Q}^T = \frac{k^2 (k - \lambda)}{\lambda^2}\, I,$
where $I$ is the identity matrix of order $v + 1$. Thus
(10) $Q_1 Q_1^T = \frac{k^2}{\lambda^2} (k - \lambda)\, I - pS,$

where $Q_1 = Q(-p, 1)$, and $S$ is the $v$ by $v$ matrix of all 1's. Let $e$ denote the
number of 1's in row $r$ of $Q$. Then

$p^2 e + (v - e) \cdot 1 = \frac{k^2}{\lambda^2} (k - \lambda) - p,$
and
$(p^2 - 1)\, e = \frac{k^2}{\lambda^2} (k - \lambda) - p - v,$

whence we conclude that $e = k$. Let $f$ denote the inner product of rows $r$ and
$s$ of $Q$, where $r \ne s$. Then

$f p^2 - 2(k - f) p + v - 2k + f = -p,$
whence
$f(p^2 + 2p + 1) = 2kp - p + 2k - v,$

and $f k^2/\lambda^2 = k^2/\lambda$. Thus $f = \lambda$, and $Q$ is the incidence matrix of a $v, k, \lambda$
configuration.
It is now clear that we have established the following: 


THEOREM 3. Let $Q$ be a 0, 1 matrix of order $v$, containing exactly $t$ 1's. Let $k = t/v$ and set $\lambda = k(k - 1)/(v - 1)$, with $0 < \lambda < k < v$. Then

$|\det Q| \le k(k - \lambda)^{(v-1)/2}$,
and equality holds if and only if $Q$ is the incidence matrix of a $v, k, \lambda$ configuration.
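
As a numerical sanity check of Theorem 3 (not part of the original argument), the following sketch builds the incidence matrix of a 7, 3, 1 configuration and compares $|\det Q|$ with $k(k - \lambda)^{(v-1)/2} = 3 \cdot 2^3 = 24$. The choice of the difference set {0, 1, 3} modulo 7, the helper name `det`, and the brute-force Leibniz expansion are illustrative assumptions, suitable only for small orders.

```python
from itertools import permutations

def det(m):
    """Exact integer determinant via the Leibniz expansion (fine for order 7)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        # Sign of the permutation from the parity of its inversion count.
        inversions = sum(
            1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b]
        )
        term = -1 if inversions % 2 else 1
        for row in range(n):
            term *= m[row][perm[row]]
        total += term
    return total

v, k, lam = 7, 3, 1
# Row i is the block {i, i+1, i+3} mod 7 (assumed difference set {0, 1, 3}).
Q = [[1 if (j - i) % v in (0, 1, 3) else 0 for j in range(v)] for i in range(v)]

t = sum(map(sum, Q))                      # total number of 1's
bound = k * (k - lam) ** ((v - 1) // 2)   # k (k - lambda)^((v-1)/2)
print(t, abs(det(Q)), bound)
```

Here $t = kv$ and the bound is attained, as the theorem predicts for an incidence matrix of a configuration.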


Consider once again Theorem 1. Note that $(k - \lambda)/\lambda = (v - k)/(k - 1)$. Thus the requirement $\lambda \le k - \lambda$ means $k \le \tfrac{1}{2}(v + 1)$, and $k - \lambda \le \lambda$ means $k \ge \tfrac{1}{2}(v + 1)$. Suppose that $k = \tfrac{1}{2}(v + 1)$. Then if $Q$ is a 0, 1 matrix with no restriction on the number of 1's, we must have

(11) $|\det Q| \le \dfrac{(v + 1)^{(v+1)/2}}{2^{v}}$.

The incidence matrix associated with the case of equality has parameters $v = 4\lambda - 1$, $k = 2\lambda$, $\lambda = \lambda$. These incidence matrices give rise to the Hadamard matrices of order $4\lambda$ (3). The determination of the maximum of $|\det Q|$, where $Q$ is of arbitrary order $v$, is an unsolved problem of considerable difficulty (5).

If we place no restriction on the number of 1's in the 0, 1 matrix $Q$ of order $v$ and assume that $|\det Q| = k(k - \lambda)^{(v-1)/2}$, then we may not conclude in general that $Q$ is the incidence matrix of a $v, k, \lambda$ configuration. For example, let $A$ be an incidence matrix of a $v, k, \lambda$ configuration with $v - 2k > 0$. Define its complement $C$ by $A + C = S$, where $S$ is the matrix of all 1's. The complement of $A$ is again a $v, k, \lambda$ configuration with parameters $\bar{v} = v$, $\bar{k} = v - k$, and $\bar{\lambda} = v - 2k + \lambda$. Note that

$|\det C| = (v - k)(k - \lambda)^{(v-1)/2}$.

It is easy to check that

$A^{-1} = \dfrac{1}{k - \lambda}\left( A^{T} - \dfrac{\lambda}{k} S \right)$,

where $A^{-1}$ denotes the inverse of $A$. Thus in $A = [a_{rs}]$, if $a_{rs} = 1$, then the cofactor of $a_{rs}$,

$A_{rs} = \dfrac{1}{k} \det A$.

Similarly for the complement $C = [c_{rs}]$, if $c_{rs} = 1$, then the cofactor of $c_{rs}$,

$C_{rs} = \dfrac{1}{v - k} \det C$.

We are assuming that $v - 2k > 0$. Thus we may replace $v - 2k$ of the 1's in the first row of $C$ by 0's. The resulting matrix $Q$ is a 0, 1 matrix satisfying

$|\det Q| = k(k - \lambda)^{(v-1)/2}$,

but $Q$ is not an incidence matrix of a $v, k, \lambda$ configuration.
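
The complement construction can be tried on a small case. The sketch below is an illustration only: it assumes the 7, 3, 1 configuration generated by the difference set {0, 1, 3} modulo 7 (so that $v - 2k = 1$), forms the complement $C$, and replaces one 1 in the first row of $C$ by 0. The helper `det` is a brute-force Leibniz expansion chosen for clarity.

```python
from itertools import permutations

def det(m):
    """Exact integer determinant via the Leibniz expansion (small orders only)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(
            1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b]
        )
        term = -1 if inversions % 2 else 1
        for row in range(n):
            term *= m[row][perm[row]]
        total += term
    return total

v, k, lam = 7, 3, 1
# Assumed example: circulant incidence matrix from the difference set {0, 1, 3}.
A = [[1 if (j - i) % v in (0, 1, 3) else 0 for j in range(v)] for i in range(v)]
C = [[1 - a for a in row] for row in A]          # complement, A + C = S

# Replace v - 2k of the 1's in the first row of C by 0's.
Q = [row[:] for row in C]
ones_in_first_row = [j for j in range(v) if Q[0][j] == 1]
for j in ones_in_first_row[: v - 2 * k]:
    Q[0][j] = 0

row_sums = [sum(row) for row in Q]
print(abs(det(C)), abs(det(Q)), row_sums)
```

The modified matrix reaches the value $k(k - \lambda)^{(v-1)/2}$ although its row sums are no longer constant, so it cannot be an incidence matrix of a configuration.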





REFERENCES 


1. R. H. Bruck and H. J. Ryser, The nonexistence of certain finite projective planes, Can. J. Math., 1 (1949), 88-93.

2. S. Chowla and H. J. Ryser, Combinatorial problems, Can. J. Math., 2 (1950), 93-99.

3. R. E. A. C. Paley, On orthogonal matrices, J. Math. Phys., 12 (1933), 311-320.

4. H. J. Ryser, Geometries and incidence matrices, Slaught Memorial Papers (Suppl. Amer. Math. Monthly), 62 (1955), 25-31.

5. John Williamson, Determinants whose elements are 0 and 1, Amer. Math. Monthly, 53 (1946), 427-434.


Ohio State University 








ON A CLASS OF ALMOST ALTERNATIVE ALGEBRAS 


L. A. KOKORIS 


Introduction. In the study of almost alternative algebras (2) relative to quasiequivalence an important class called algebras of $(\gamma, \delta)$ type arises. An algebra of $(\gamma, \delta)$ type is a finite dimensional algebra $\mathfrak{A}$ over a field $\mathfrak{F}$ satisfying the identities

(1) $z(xy) = (zx)y + \gamma(xz)y - \gamma x(zy) + \delta(yz)x - \delta y(zx)$,
and
(2) $(xy)z = x(yz) + \gamma(xz)y - \gamma x(zy) + (\delta - 1)(yz)x - (\delta - 1)y(zx)$

where $\gamma$ and $\delta$ are elements of $\mathfrak{F}$ satisfying $\gamma^2 - \delta^2 + \delta = 1$. We shall restrict our study to $(\gamma, \delta)$ type algebras with characteristic $\ne 2$, 3, or 5 and with $\delta \ne 0, 1$. With these restrictions the algebras are power-associative. Also, Albert has shown (2, p. 36) that if an algebra $\mathfrak{A}$ of $(\gamma, \delta)$ type has an idempotent $e$ it can be decomposed into a supplementary sum $\mathfrak{A} = \mathfrak{A}_{11} + \mathfrak{A}_{10} + \mathfrak{A}_{01} + \mathfrak{A}_{00}$ where $x$ is in $\mathfrak{A}_{ij}$ if and only if $ex = ix$ and $xe = jx$. The subspaces of our decomposition have the same multiplicative properties as in the case of an associative algebra.

The concepts of a solvable algebra, nilpotent algebra, and nil algebra are equivalent for $(\gamma, \delta)$ type algebras with the restrictions mentioned above (2, p. 35). The radical is defined to be the maximal nilideal and it is then proved that a simple algebra is either associative or contains a unity which is an absolutely primitive idempotent. A semisimple algebra is a direct sum of simple algebras.

If $\delta = 0$ or 1 we have the four pairs $(\gamma, \delta) = (1, 1)$, $(-1, 0)$, $(1, 0)$, or $(-1, 1)$. The pair $(-1, 1)$ implies that the algebra is right alternative and $(1, 0)$ implies the left alternative law. In the remaining two cases we are not able to obtain the same multiplicative relations for the subspaces of the decomposition as for the general case and it seems that the results here should be different.


1. Decomposition relative to an idempotent. Let $\mathfrak{A}$ be an algebra of $(\gamma, \delta)$ type with characteristic $\ne 2$ and with an idempotent $e$. If $(\gamma, \delta) \ne (-1, 1)$ or $(1, 0)$, it is known that $\mathfrak{A}$ may be decomposed into a vector space direct sum $\mathfrak{A} = \mathfrak{A}_{11} + \mathfrak{A}_{10} + \mathfrak{A}_{01} + \mathfrak{A}_{00}$. This is the decomposition of the theory of associative algebras and we are able to obtain the multiplicative relations of the associative theory when $\delta \ne 0, 1$.


Received July 25, 1955. Presented to the American Mathematical Society, September 1, 
1955. 










THEOREM 1. Let $\mathfrak{A}$ be an algebra of $(\gamma, \delta)$ type with $\delta \ne 0, 1$ and characteristic $\ne 2, 3$. Then $\mathfrak{A}_{ij}\mathfrak{A}_{qt} = 0$ if $j \ne q$ and $\mathfrak{A}_{ij}\mathfrak{A}_{jt} \subseteq \mathfrak{A}_{it}$.


The proof is made by considering the various cases. Take $z = e$, $x$ in $\mathfrak{A}_{ij}$, and $y$ in $\mathfrak{A}_{qt}$. Then (1) becomes

(3) $e(xy) = (i + j\gamma - q\gamma)xy + (t\delta - i\delta)yx$.
Interchanging $x$ and $y$ gives

(4) $e(yx) = (q + t\gamma - i\gamma)yx + (j\delta - q\delta)xy$.
With $x, y, z$ as above, (2) becomes

(5) $(xy)e = (t + j\gamma - q\gamma)xy + (\delta - 1)(t - i)yx$.
Interchanging the roles of $x$ and $y$ we have

(6) $(yx)e = (j + t\gamma - i\gamma)yx + (\delta - 1)(j - q)xy$.
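
The case analysis that follows only specializes the indices $i, j, q, t$ in relations (3) to (6). As a bookkeeping aid, the sketch below tabulates the coefficients of $xy$ and $yx$ in each relation; the function name `peirce_coefficients` and the sample pair $(\gamma, \delta) = (13/8, -7/8)$, which satisfies $\gamma^2 - \delta^2 + \delta = 1$, are illustrative assumptions and not taken from the paper.

```python
from fractions import Fraction

def peirce_coefficients(i, j, q, t, gamma, delta):
    """Coefficients (of xy, of yx) in relations (3)-(6) for x in A_ij, y in A_qt."""
    return {
        "e(xy)": (i + j * gamma - q * gamma, t * delta - i * delta),    # (3)
        "e(yx)": (j * delta - q * delta, q + t * gamma - i * gamma),    # (4)
        "(xy)e": (t + j * gamma - q * gamma, (delta - 1) * (t - i)),    # (5)
        "(yx)e": ((delta - 1) * (j - q), j + t * gamma - i * gamma),    # (6)
    }

# Hypothetical sample parameters with gamma^2 - delta^2 + delta = 1.
gamma, delta = Fraction(13, 8), Fraction(-7, 8)

# x in A_11, y in A_00: e(xy) = (1 + gamma) xy - delta yx, e(yx) = delta xy - gamma yx.
case_11_00 = peirce_coefficients(1, 1, 0, 0, gamma, delta)
# x, y in A_11: e(xy) = xy and (xy)e = xy.
case_11_11 = peirce_coefficients(1, 1, 1, 1, gamma, delta)
print(case_11_00["e(xy)"], case_11_11["(xy)e"])
```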


Now consider the case where $x$ and $y$ are in $\mathfrak{A}_{11}$ so that $i = j = q = t = 1$. Relations (3) and (5) yield $e(xy) = xy$, and $(xy)e = xy$. Therefore, $\mathfrak{A}_{11}$ is a subalgebra. The values $i = j = q = t = 0$ in (3) and (5) prove that $\mathfrak{A}_{00}$ is also a subalgebra. When $x$ is in $\mathfrak{A}_{11}$, $y$ is in $\mathfrak{A}_{00}$, (3) and (4) give $e(xy) = (1 + \gamma)xy - \delta yx$ and $e(yx) = -\gamma yx + \delta xy$. We now use the fact that $L_e^2 = L_e$ (later (cf. 2, p. 36) we shall also need $R_e^2 = R_e$) to see that

$e[e(xy)] = e(xy)$, $e(xy) = (1 + \gamma)[e(xy)] - \delta e(yx)$.

It follows that $(\gamma^2 - \delta^2 + \gamma)xy = 0$. Since $\gamma^2 - \delta^2 + \gamma = 0$ together with the defining relation $\gamma^2 - \delta^2 + \delta = 1$ for an algebra of $(\gamma, \delta)$ type implies $\delta = 0$, we must have $xy = 0$. Also,

$e[e(yx)] = e(yx)$, $-\gamma e(yx) + \delta e(xy) = e(yx)$.
Consequently $(\gamma^2 - \delta^2 + \gamma)yx = 0$ and so $yx = 0$. Thus $\mathfrak{A}_{11}$ and $\mathfrak{A}_{00}$ are orthogonal subalgebras.
If $x$ is in $\mathfrak{A}_{11}$ and $y$ is in $\mathfrak{A}_{10}$, we have $e(xy) = xy - \delta yx$, $e(yx) = (1 - \gamma)yx = (yx)e$, and $(xy)e = (1 - \delta)yx$. Then $e[e(xy)] = e(xy)$ implies $\delta e(yx) = 0$ and it follows that $yx = 0$. This also proves that $xy$ is in $\mathfrak{A}_{10}$. Next let $x$ be in $\mathfrak{A}_{11}$ and $y$ be in $\mathfrak{A}_{01}$ so that

$e(xy) = (1 + \gamma)xy = (xy)e$, $e(yx) = \delta xy$, $(yx)e = yx + (\delta - 1)xy$.
The result $xy = 0$ is obtained by noting that $[(yx)e]e = (yx)e$ and $(\delta - 1)[(xy)e] = 0$. Then $yx$ is in $\mathfrak{A}_{01}$.
Consider the case where $x$ and $y$ are both in $\mathfrak{A}_{10}$ and
$e(xy) = (1 - \gamma)xy - \delta yx$, $e(yx) = (1 - \gamma)yx - \delta xy$, $(xy)e = -\gamma xy + (1 - \delta)yx$, $(yx)e = -\gamma yx + (1 - \delta)xy$.
From $e[e(xy)] = e(xy)$ and $e[e(yx)] = e(yx)$ we obtain $(\gamma + \delta)[e(xy) + e(yx)] = 0$. Since $\gamma + \delta \ne 0$ by hypothesis,
$e(xy) + e(yx) = 0 = (1 - \gamma - \delta)(xy + yx)$.

Again $1 - \gamma - \delta \ne 0$ by hypothesis, so $xy + yx = 0$. Thus

$e(xy) = (1 - \gamma + \delta)xy$, $(xy)e = (-1 - \gamma + \delta)xy$.
Moreover,

$e[e(xy)] = e(xy) = (1 - \gamma + \delta)e(xy)$, $(1 - \gamma + \delta)(-\gamma + \delta)xy = 0$.

When the characteristic is not 3, $\delta \ne 0, 1$ implies $xy = 0$. The case with $x$ in $\mathfrak{A}_{10}$ and $y$ in $\mathfrak{A}_{01}$ is proved immediately by substituting in (3) to (6). If both $x$ and $y$ are in $\mathfrak{A}_{01}$, $e(xy) = \gamma xy + \delta yx$ and $e(yx) = \gamma yx + \delta xy$. Therefore

$e[e(xy)] = e(xy) = \gamma e(xy) + \delta e(yx)$, $e[e(yx)] = e(yx) = \gamma e(yx) + \delta e(xy)$

when added give $(-1 + \gamma + \delta)[e(xy) + e(yx)] = 0$. Hence $(\gamma + \delta)(xy + yx) = 0$ and thus $xy = -yx$. We then have

$e(xy) = (\gamma - \delta)xy$, $e[e(xy)] = e(xy) = (\gamma - \delta)[e(xy)]$.
This implies $(\gamma - \delta - 1)(\gamma - \delta)xy = 0$, $xy = 0$.

Take $x$ in $\mathfrak{A}_{10}$ and $y$ in $\mathfrak{A}_{00}$. Then $e(xy) = xy - \delta yx$, $(xy)e = (1 - \delta)yx$, and $(yx)e = -\gamma yx$. We have $[(xy)e]e = (1 - \delta)[(yx)e] = (xy)e$ and $(1 - \delta)(1 + \gamma)yx = 0$. Our hypothesis on $\delta$ implies $yx = 0$ and it follows that $xy$ is in $\mathfrak{A}_{10}$. The last case is with $x$ in $\mathfrak{A}_{01}$ and $y$ in $\mathfrak{A}_{00}$. Relations (3) to (6) become

$e(xy) = \gamma xy$, $e(yx) = \delta xy$, $(xy)e = \gamma xy$, $(yx)e = yx + (\delta - 1)xy$.

Also $e[e(yx)] = e(yx) = \delta e(xy)$ and so $\delta(1 - \gamma)xy = 0$. Since $\delta(1 - \gamma) \ne 0$, $xy = 0$ and $yx$ is in $\mathfrak{A}_{01}$. This completes the proof of Theorem 1.
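
The multiplicative relations of Theorem 1 are those of the associative theory. A minimal illustration, assuming the (associative) algebra of 2 by 2 matrices with the idempotent $e = E_{11}$, is sketched below; an associative algebra satisfies (1) and (2) trivially, so this only displays the shape of the relations and does not exercise the nonassociative content of the theorem. All names are hypothetical.

```python
def mul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[r][s] * b[s][c] for s in range(2)) for c in range(2))
        for r in range(2)
    )

def scale(c, a):
    return tuple(tuple(c * x for x in row) for row in a)

ZERO = ((0, 0), (0, 0))
e = ((1, 0), (0, 0))  # the idempotent E11

# u[(i, j)] spans the subspace A_ij: e*u = i*u and u*e = j*u.
u = {
    (1, 1): ((1, 0), (0, 0)),
    (1, 0): ((0, 1), (0, 0)),
    (0, 1): ((0, 0), (1, 0)),
    (0, 0): ((0, 0), (0, 1)),
}

def relations_hold():
    """Check A_ij A_qt = 0 for j != q and A_ij A_jt inside A_it on the basis."""
    for (i, j), x in u.items():
        if mul(e, x) != scale(i, x) or mul(x, e) != scale(j, x):
            return False
        for (q, t), y in u.items():
            xy = mul(x, y)
            if j != q and xy != ZERO:
                return False
            if j == q and xy != u[(i, t)]:
                return False
    return True

print(relations_hold())
```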

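2. Power-associativity. When $x = y = z$, relation (1) becomes $(1 + \gamma + \delta)(xx^2 - x^2x) = 0$ and (2) yields $(2 - \gamma - \delta)(xx^2 - x^2x) = 0$. Addition of the two expressions gives $xx^2 = x^2x$ if the characteristic $\ne 3$. Assume that $\mathfrak{A}$ is an algebra of $(\gamma, \delta)$ type with characteristic $\ne 2, 3$ and let $z = x^2$, $y = x$ in (1) and (2) to obtain
$x^2x^2 = (1 + \gamma + \delta)x^3x - (\gamma + \delta)xx^3 = (-1 + \gamma + \delta)x^3x - (-2 + \gamma + \delta)xx^3$.
It follows that $2x^3x = 2xx^3$ and $x^2x^2 = x^3x = xx^3$. If also $\mathfrak{A}$ has characteristic $\ne 5$, it satisfies the hypotheses of the known (1, Lemma 4):

LEMMA 1. Let $\mathfrak{A}$ be an algebra with characteristic $\ne 2, 3, 5$ and $x^{\lambda}x^{\mu} = x^{\lambda+\mu}$ for $\lambda + \mu < n$, $n \ge 5$. Then
(7) $x^{n-\alpha}x^{\alpha} = x^{n-1}x + \dfrac{\alpha - 1}{2}[x^{n-1}, x]$ $(\alpha = 1, \ldots, n - 1)$
where $[x^{n-1}, x] = x^{n-1}x - xx^{n-1}$. Also, $n[x^{n-1}, x] = 0$.


The Lemma will be used to show that an algebra of $(\gamma, \delta)$ type is power-associative if its characteristic $\ne 2, 3, 5$. Write $x^{\alpha}$ for $x$, $x^{\beta}$ for $y$ and $x^{n-\alpha-\beta}$ for $z$ in (1) where $\alpha, \beta$ are positive integers such that $\alpha + \beta < n$ and assume that $x^{\lambda}x^{\mu} = x^{\lambda+\mu}$ for $\lambda + \mu < n$ to obtain

$x^{n-\alpha-\beta}x^{\alpha+\beta} = x^{n-\beta}x^{\beta} + \gamma x^{n-\beta}x^{\beta} - \gamma x^{\alpha}x^{n-\alpha} + \delta x^{n-\alpha}x^{\alpha} - \delta x^{\beta}x^{n-\beta}$.

By (7) we have after multiplying by 2,
$(\alpha + \beta - 1)[x^{n-1}, x] = (1 + \gamma)(\beta - 1)[x^{n-1}, x] - \gamma(n - \alpha - 1)[x^{n-1}, x] + \delta(\alpha - 1)[x^{n-1}, x] - \delta(n - \beta - 1)[x^{n-1}, x]$.
Thus either $[x^{n-1}, x] = 0$ or
$\alpha + \beta - 1 = \beta - 1 + \gamma\beta - \gamma + \gamma\alpha + \gamma + \delta\alpha - \delta + \delta\beta + \delta$.


If $[x^{n-1}, x] = 0$, (7) implies $\mathfrak{A}$ is power-associative. Otherwise $\alpha = (\gamma + \delta)(\alpha + \beta)$. Since $\alpha$ and $\beta$ are any positive integers, restricted only by $\alpha + \beta < n$, interchange $\alpha$ and $\beta$ to obtain $\beta = (\gamma + \delta)(\alpha + \beta)$. Adding,

$\alpha + \beta = 2(\gamma + \delta)(\alpha + \beta)$

and $\alpha = \beta = 1$ implies $2(\gamma + \delta) = 1$. But it is impossible for $\gamma$ and $\delta$ to satisfy both this equation and $\gamma^2 - \delta^2 + \delta = 1$.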
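
The incompatibility asserted in the last sentence can be checked in one line. The following worked computation is a sketch under the standing assumption that the characteristic is not 2 or 3; it simply substitutes $\gamma = \tfrac{1}{2} - \delta$ into the defining relation.

```latex
\begin{align*}
2(\gamma + \delta) = 1 \;&\Longrightarrow\; \gamma = \tfrac{1}{2} - \delta, \\
\gamma^2 - \delta^2 + \delta
  &= \left(\tfrac{1}{4} - \delta + \delta^2\right) - \delta^2 + \delta
   = \tfrac{1}{4}.
\end{align*}
```

Hence $\gamma^2 - \delta^2 + \delta = 1$ would force $\tfrac{3}{4} = 0$, that is, $3 = 0$ in the base field, which the restriction on the characteristic excludes.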


THEOREM 2. An algebra $\mathfrak{A}$ of $(\gamma, \delta)$ type whose characteristic $\ne 2, 3, 5$ is power-associative.


3. Simple algebras. From this point on we shall consider algebras of $(\gamma, \delta)$ type with $\delta \ne 0, 1$ and with characteristic $\ne 2, 3, 5$ so that we may use the results of Theorems 1 and 2. We shall make use of the associator $(x, y, z)$ which is defined by $(x, y, z) = (xy)z - x(yz)$. If $\mathfrak{A}$ is an algebra with an idempotent $e$ we may prove the following result.


LEMMA 2. The associator $(x, y, z)$ is 0 if one of the elements $x, y, z$ is in $\mathfrak{A}_{10}$ or $\mathfrak{A}_{01}$.


First consider the possible ordered triples with $x_{10}$ in $\mathfrak{A}_{10}$ on the left and $y, z$ in the decomposition subspaces. It is clear that by linearity we need only consider elements in the subspaces of the decomposition. By Theorem 1 it is clear that the only triples with $x_{10}$ on the left giving nonzero products are

$x_{10}, y_{01}, z_{11}$; $x_{10}, y_{01}, z_{10}$; $x_{10}, y_{00}, z_{01}$; $x_{10}, y_{00}, z_{00}$,

where the subscripts indicate the subspaces in which the elements lie. Let $x = x_{10}$, $y = y_{01}$, $z = z_{11}$ in (2) and use the fact that our decomposition is supplementary to obtain $x_{10}(y_{01}z_{11}) = (x_{10}y_{01})z_{11}$. Similarly we prove the result for the second and third triples. For the last triple we use (1) with $z = x_{10}$, $x = y_{00}$, $y = z_{00}$ to get $x_{10}(y_{00}z_{00}) = (x_{10}y_{00})z_{00}$.

Triples with $y_{10}$ in the middle giving nonzero products are

$x_{11}, y_{10}, z_{01}$; $x_{11}, y_{10}, z_{00}$; $x_{01}, y_{10}, z_{01}$; $x_{01}, y_{10}, z_{00}$.

The result of the Lemma is proved by making the obvious substitutions in (1) for the first two of these triples and in (2) for the last two.

There are also four triples with $z_{10}$ on the right giving nonzero products. These are

$x_{11}, y_{11}, z_{10}$; $x_{01}, y_{11}, z_{10}$; $x_{10}, y_{01}, z_{10}$; $x_{00}, y_{01}, z_{10}$.

For the first three substitute in (2) and use (1) for the last triple. By symmetry we have the result for elements in $\mathfrak{A}_{01}$.











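COROLLARY. The algebra $\mathfrak{A}$ is associative if and only if $\mathfrak{A}_{11}$ and $\mathfrak{A}_{00}$ are associative.


Now let $\mathfrak{A}$ be a simple algebra. There must be a nonnilpotent element $x$ in $\mathfrak{A}$ and the subalgebra generated by $x$ must be associative since $\mathfrak{A}$ is power-associative. Since an associative algebra not a nilalgebra has an idempotent, $\mathfrak{A}$ has an idempotent $e$. Decompose $\mathfrak{A}$ relative to $e$. Then the sets

$\mathfrak{B} = \mathfrak{A}_{11} + \mathfrak{A}_{10} + \mathfrak{A}_{01} + \mathfrak{A}_{01}\mathfrak{A}_{10}$, $\mathfrak{C} = \mathfrak{A}_{00} + \mathfrak{A}_{10} + \mathfrak{A}_{01} + \mathfrak{A}_{10}\mathfrak{A}_{01}$

can easily be seen to be ideals of $\mathfrak{A}$. Since $e$ is in $\mathfrak{B}$, $\mathfrak{B} = \mathfrak{A}$ and thus $\mathfrak{A}_{00} = \mathfrak{A}_{01}\mathfrak{A}_{10}$. It follows from this and Lemma 2 that $\mathfrak{A}_{00}$ is zero or an associative algebra. In the latter case $\mathfrak{A}_{00}$ is simple for if $\mathfrak{B}_{00}$ were a proper ideal of $\mathfrak{A}_{00}$, then $\mathfrak{B}_{00}$ would generate the proper ideal

$\mathfrak{B}_{00} + \mathfrak{A}_{10}\mathfrak{B}_{00} + \mathfrak{B}_{00}\mathfrak{A}_{01} + \mathfrak{A}_{10}\mathfrak{B}_{00}\mathfrak{A}_{01}$

of $\mathfrak{A}$. The ideal $\mathfrak{C} = \mathfrak{A}$ or 0. If $\mathfrak{C} = \mathfrak{A}$, $\mathfrak{A}_{11} = \mathfrak{A}_{10}\mathfrak{A}_{01}$ and $\mathfrak{A}_{11}$ is a simple associative algebra. If $\mathfrak{C} = 0$, $\mathfrak{A} = \mathfrak{A}_{11}$ and $e$ is the unity element of $\mathfrak{A}$. In case $e = u + v$ is not primitive, we can get a proper decomposition with respect to $u$ and with the new $\mathfrak{A}_{00} \ne 0$. Then $\mathfrak{A}$ is associative. When $e$ is not absolutely primitive we can find a scalar extension $\mathfrak{K}$ of the base field $\mathfrak{F}$ such that $e = u + v$ for pairwise orthogonal idempotents $u, v$ in $\mathfrak{A}_{\mathfrak{K}}$. Consequently $\mathfrak{A}_{\mathfrak{K}}$ is associative and $\mathfrak{A}$ is associative.


THEOREM 3. Let $\mathfrak{A}$ be a simple algebra of $(\gamma, \delta)$ type with $\delta \ne 0, 1$ and with characteristic $\ne 2, 3, 5$. Then $\mathfrak{A}$ is an associative algebra or $\mathfrak{A}$ has a unity quantity which is an absolutely primitive idempotent.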


4. Semisimple algebras. The study of semisimple algebras begins with 


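THEOREM 4. Let $e$ be a principal idempotent of an algebra $\mathfrak{A}$ of $(\gamma, \delta)$ type with $\delta \ne 0, 1$ and characteristic $\ne 2, 3, 5$. Then $\mathfrak{A}_{10} + \mathfrak{A}_{01} + \mathfrak{A}_{00}$ is contained in the radical $\mathfrak{N}$ of $\mathfrak{A}$.


The proof is made by an induction on the order of $\mathfrak{A}$. The result is clear when $\mathfrak{A}$ has order one. Assume the Theorem for all algebras of order less than $n$ and let $\mathfrak{A}$ have order $n$. If $\mathfrak{A}$ is not semisimple we consider $\mathfrak{B} = \mathfrak{A} - \mathfrak{N}$ which has order $m < n$. The principal idempotent $e$ of $\mathfrak{A}$ corresponds to a principal idempotent $u$ of $\mathfrak{B}$. Decompose $\mathfrak{B}$ relative to $u$. Since $\mathfrak{B}$ is semisimple our induction hypothesis implies $\mathfrak{B}_{10} + \mathfrak{B}_{01} + \mathfrak{B}_{00} = 0$. This implies that in the decomposition of $\mathfrak{A}$ relative to $e$, $\mathfrak{A}_{10} + \mathfrak{A}_{01} + \mathfrak{A}_{00} \subseteq \mathfrak{N}$.

If $\mathfrak{A}$ is simple, Theorem 3 implies $\mathfrak{A}$ has a unity $e$ and an algebra with a unity has no other principal idempotent. Thus we may pass to the consideration of a semisimple algebra $\mathfrak{A}$ with a proper ideal $\mathfrak{B}$. The ideal $\mathfrak{B}$ cannot be a nilideal so it must contain an idempotent and hence a principal idempotent $e$. Then $\mathfrak{B} = \mathfrak{B}_{11} + \mathfrak{B}_{10} + \mathfrak{B}_{01} + \mathfrak{B}_{00}$ and we may also decompose $\mathfrak{A}$ relative to $e$ so that $\mathfrak{A} = \mathfrak{A}_{11} + \mathfrak{A}_{10} + \mathfrak{A}_{01} + \mathfrak{A}_{00}$. The idempotent $e$ is in $\mathfrak{B}$ and so if $ex = x$ or $xe = x$, it follows that $x$ is also in $\mathfrak{B}$. Consequently, $\mathfrak{A} = \mathfrak{B}_{11} + \mathfrak{B}_{10} + \mathfrak{B}_{01} + \mathfrak{A}_{00}$. By the induction $\mathfrak{B}$ has radical $\mathfrak{M} = \mathfrak{M}_{11} + \mathfrak{B}_{10} + \mathfrak{B}_{01} + \mathfrak{B}_{00}$ where $\mathfrak{M}_{11}$ is the part of $\mathfrak{M}$ in $\mathfrak{B}_{11}$. Since $\mathfrak{B}$ is an ideal of $\mathfrak{A}$ it follows that $\mathfrak{M}$ is a nilideal of $\mathfrak{A}$ and that $\mathfrak{M} = 0$. Therefore $\mathfrak{A} = \mathfrak{B} \oplus \mathfrak{A}_{00}$ and $e$ is the unity quantity of $\mathfrak{B}$. The subalgebra $\mathfrak{A}_{00}$ is an ideal of $\mathfrak{A}$ and by a repetition of the above argument $\mathfrak{A}_{00}$ has a unity $f$. Then $u = e + f$ is a unity for $\mathfrak{A}$ and is therefore the only principal idempotent of $\mathfrak{A}$. This completes the proof¹ of Theorem 4. We have also proved


THEOREM 5. A semisimple algebra of $(\gamma, \delta)$ type with $\delta \ne 0, 1$ and with characteristic $\ne 2, 3, 5$ has a unity quantity and is a direct sum of simple algebras.


¹The reader should notice that our proofs follow those of Theorems 7 and 8 of Albert (3).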


REFERENCES 


1. A. A. Albert, On the power-associativity of rings, Summa Braziliensis Mathematicae, 2 (1948), 1-13.
2. A. A. Albert, Almost alternative algebras, Portugaliae Mathematica, 8 (1949), 23-36.
3. A. A. Albert, A theory of power-associative commutative algebras, Trans. Amer. Math. Soc., 69 (1950), 503-527.


Washington University, St. Louis











ORTHOGONAL ISOMORPHIC REPRESENTATIONS OF 
FREE GROUPS 


J. DE GROOT 


1. Introduction. We consider the group $G$ of proper orthogonal transformations (rotations) in three-dimensional Euclidean space, represented by real orthogonal matrices $(a_{ik})$ $(i, k = 1, 2, 3)$ with determinant $+1$. It is known that this rotation group $G$ contains free (non-abelian) subgroups; in fact Hausdorff (5) showed how to find two rotations $P$ and $Q$ generating a group with only two non-trivial relations

$P^2 = Q^3 = I$.

Now the elements $PQPQ$ and $PQ^2PQ^2$ are free generators of a free rotation group $R$ (7). It was shown in (4), starting from $R$ and using transfinite induction, that $G$ contains even a free subgroup with continuously many free generators.¹

Now it is clear that this method for constructing free subgroups of $G$ is an indirect one and furnishes only special free rotation groups. These disadvantages became more visible in certain problems (partly geometrical, partly group-theoretical) dealing with free rotation groups in spaces of dimension >3. Therefore we shall develop in this paper a straightforward and simple method of determining free subgroups of $G$ (we restrict ourselves, however, to the three-dimensional case). The only, but in many cases serious, difficulty with this type of problem is to prove time and again that certain products of matrices do not vanish identically. As our main result (Theorem II) we shall give, explicitly, continuously many rotations (with the same rotation angle, and rotation axes situated in the same plane), which are free generators of a free group (of continuous rank). Other representations of free groups were given by Fuchs-Rabinowitsch (3) and Doniakhi (2; see also Sanov 8), who use two-rowed square matrices; however, these cannot be orthogonal. Some conjectures are stated in §5.


Received March 23, 1955. 


¹After completion of this paper, the author heard from Poland that Sierpinski proved a lemma (9, p. 238) which, though not stated in terms of group theory, implies the existence of a free rotation group of countable rank, and from which the existence of a free rotation group of continuous rank can easily be deduced. Sierpinski already uses in his proof the "von Neumann numbers" (see §4 of this paper). On the other hand we see that Sierpinski's proof essentially makes use of the Hausdorff result (5), just as does the proof in (4). For this reason the methods derived in the present paper improve those in (9) and (4), and Theorems I and II cannot be obtained by employing the methods of (9), (4) only.


2. Preliminaries. We consider polynomials $V$ in the variables $\sin n\phi$, $\cos m\phi$ ($n$, $m$ ranging over the integers) over the real field. Each term

$\prod_{i} \sin^{r_i} n_i\phi \; \prod_{j} \cos^{s_j} m_j\phi$

has degree

$\sum_{i} |r_i n_i| + \sum_{j} |s_j m_j|$.

The degree of $V$ is the maximum of the degrees of each of its terms.


LEMMA I. A polynomial $V$, having only one term of degree equal to the degree of $V$ itself, is a non-constant function of $\phi$.


Proof. Using the formulae

$\sin n\phi = \binom{n}{1} \cos^{n-1}\phi \, \sin\phi - \binom{n}{3} \cos^{n-3}\phi \, \sin^{3}\phi + \cdots = \left[ \binom{n}{1} + \binom{n}{3} + \cdots \right] \cos^{n-1}\phi \, \sin\phi + \cdots$,

$\cos m\phi = \cos^{m}\phi - \binom{m}{2} \cos^{m-2}\phi \, \sin^{2}\phi + \cdots = \left[ 1 + \binom{m}{2} + \binom{m}{4} + \cdots \right] \cos^{m}\phi + \cdots$,

$V$ is transformed into a polynomial in $\sin\phi$ and $\cos\phi$, again having only one term of maximal degree. This expression of $V$ can obviously be written in the form

(1) $\sum_{i=0}^{k} (\alpha_i \sin\phi + \beta_i \cos\phi) \cos^{i}\phi + \gamma$,

either $\alpha_k$ or $\beta_k$ being equal to zero.

If $k = 0$ this polynomial does not vanish identically in $\phi$. Then the lemma follows by induction. Suppose the lemma holds for $k - 1$; suppose that for the value $k$ the polynomial $V = 0$. Substituting $\phi = \pm\tfrac{1}{2}\pi$ in (1) we find $\alpha_0 = \gamma = 0$; thus we can divide $V$ by $\cos\phi$, and get a contradiction. So $V \ne 0$, and $V$ is non-constant.


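LEMMA II. Consider the real orthogonal matrices

$A(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}$, $B(\phi) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & -\sin\phi \\ 0 & \sin\phi & \cos\phi \end{pmatrix}$.

Any proper* product in terms of $A$ and $B$ is a non-constant matrix (depending on $\phi$).


*The product is proper if it cannot be transformed into the unity-matrix $I$ using only the trivial relations $A \cdot A^{-1} = B \cdot B^{-1} = I$, $A \cdot I = I \cdot A = A$, $B \cdot I = I \cdot B = B$.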











Proof. At first we prove the Lemma for products of the form

(2) $A^{n_1} B^{m_1} A^{n_2} B^{m_2} \cdots A^{n_k} B^{m_k}$ ($n_i$, $m_i$ integers $\ne 0$).

If $k = 1$, we get

$A^{n_1} B^{m_1} = \begin{pmatrix} \cos n_1\phi & -\sin n_1\phi \, \cos m_1\phi & \sin n_1\phi \, \sin m_1\phi \\ \sin n_1\phi & \cos n_1\phi \, \cos m_1\phi & -\cos n_1\phi \, \sin m_1\phi \\ 0 & \sin m_1\phi & \cos m_1\phi \end{pmatrix}$.

This is obviously a non-constant matrix. (2) is a matrix $(a_{ik})$. Denote the degree of $a_{ik}$ by $d_{ik}$. Suppose these degrees satisfy for $k = l - 1$ the relations

(3) $d_{11}, d_{21} \le \sum_{i=1}^{k} |n_i| + \sum_{i=1}^{k-1} |m_i|$, $\quad d_{12}, d_{13}, d_{22}, d_{23} = \sum_{i=1}^{k} (|n_i| + |m_i|)$,

while, moreover, each of the elements $a_{12}, a_{13}, a_{22}, a_{23}$ has exactly one term of the corresponding degree denoted in (3).
Now one sees easily by multiplying $(a_{ik})$ (for $k = l - 1$) with $A^{n_l} B^{m_l}$ that $(a_{ik})$ (for $k = l$) satisfies the same conditions. Since this is also true for $k = 1$, the Lemma follows by induction, applying Lemma I, for all products of type (2).
If we multiply (2) on the right by $A^{n_{k+1}}$ we see also in the same way, using the properties mentioned, that this product depends on $\phi$. Since the interchanging of $A$ and $B$ in (2) does no harm, the Lemma is proved.
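
The base case $k = 1$ of the proof rests on $A^{n} = A(n\phi)$ and $B^{m} = B(m\phi)$. The following numerical sketch (an illustration only; the function names and the sample values $\phi = 0.3$, $n_1 = 2$, $m_1 = 3$ are assumptions) compares the product obtained by repeated multiplication with the closed form displayed above.

```python
import math

def A(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def B(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def power(M, n):
    """M**n for n >= 0 by repeated multiplication."""
    P = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(n):
        P = matmul(P, M)
    return P

def closed_form(n1, m1, phi):
    """The matrix A^{n1} B^{m1} as displayed in the proof of Lemma II."""
    cn, sn = math.cos(n1 * phi), math.sin(n1 * phi)
    cm, sm = math.cos(m1 * phi), math.sin(m1 * phi)
    return [[cn, -sn * cm, sn * sm],
            [sn, cn * cm, -cn * sm],
            [0.0, sm, cm]]

def max_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(3) for j in range(3))

phi, n1, m1 = 0.3, 2, 3
product = matmul(power(A(phi), n1), power(B(phi), m1))
print(max_diff(product, closed_form(n1, m1, phi)))
```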


Remark. In the proof it is possible, but not necessary, to consider the whole of the matrix $(a_{ik})$; one could also deal with the degrees of the second row only. One might also consider the degree of the trace of $(a_{ik})$ (independence of the chosen coordinate-system).


3. Countable representations. Using the well-known substitution

(4) $\phi = 2 \arctan x$ $(0 < \phi < \pi)$,
which yields
(5) $\sin\phi = \dfrac{2x}{1 + x^2}$, $\cos\phi = \dfrac{1 - x^2}{1 + x^2}$,

the expression (2) is rationalized in terms of $x$. If $x$ is transcendental, we call $\phi$ associated transcendental.
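
Substitution (4) turns every entry of a product of type (2) into a rational function of $x$. For a rational value of $x$ the entries can therefore be evaluated exactly, as the sketch below does; the value $x = 1/2$ and all function names are illustrative assumptions, and a rational $x$ is of course not a transcendental value of the kind the theorems require.

```python
from fractions import Fraction

def sin_cos(x):
    """Exact sin(phi) and cos(phi) for phi = 2 arctan x, by formula (5)."""
    x = Fraction(x)
    d = 1 + x * x
    return 2 * x / d, (1 - x * x) / d

def A(x):
    s, c = sin_cos(x)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def B(x):
    s, c = sin_cos(x)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def inverse(M):
    """Inverse of an orthogonal matrix is its transpose."""
    return [[M[j][i] for j in range(3)] for i in range(3)]

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

x = Fraction(1, 2)
s, c = sin_cos(x)
a, b = A(x), B(x)
# The proper product A B A^{-1} B^{-1}, evaluated without rounding error.
word = matmul(matmul(matmul(a, b), inverse(a)), inverse(b))
print(s, c, word == IDENTITY)
```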




Now we can state 


THEOREM I. The rotation group generated by the rotations $A(\phi)$ and $B(\phi)$, these two being rotations with rotation angle $\phi$ and with rotation axes perpendicular to each other, is a free (non-abelian) group (of rank two) for any fixed, associated transcendental value $\phi$.


Proof. It follows from Lemma II that any element (2) of the group $H$, generated by $A = A(\phi)$ and $B = B(\phi)$, is a non-constant matrix, if it cannot be transformed into identity by using the trivial relations. Now any non-trivial relation in $H$ can be written in the form

(6) $A^{n_1} B^{m_1} \cdots A^{n_k} B^{m_k} = I$,

the product being a proper product if $k > 0$. Since (2) is a non-constant function of $\phi$, (6) can be transformed, using (4), into a finite number of algebraic equations in $x$, not all vanishing identically. So substituting for $\phi$ any fixed associated transcendental number $\phi$, no relation (6) is valid, and the theorem is proved.


Since in a free group generated by $A$ and $B$ the elements $A^{i} B A^{-i}$ $(i = 0, 1, 2, \ldots)$ are free generators of a free group of infinite (but countable) rank, we get


COROLLARY. For any fixed associated transcendental $\phi$ the rotations with rotation angle $\phi$ and with rotation axes in the same plane and making angles $i\phi$ $(i = 0, 1, 2, \ldots)$ with a fixed line in this plane are free generators of a free group (of infinite rank).


4. Uncountable representations. J. von Neumann (6) proved that the set $\{x_t\} = M$ of distinct real numbers $x_t$, defined by

$x_t = \sum_{n=0}^{\infty} \dfrac{2^{2^{[nt]}}}{2^{2^{n^2}}}$ $(t > 0)$,

are algebraically independent over the field of rational numbers (no finite set $\{a_i\}$ of distinct numbers $a_i \in M$ satisfies an equation $P(y_i) = 0$, if $P(y_i)$ is a non-vanishing polynomial in the variables $y_i$ with rational coefficients). Thus there are continuously many, distinct, associated transcendental numbers

(7) $\phi_t = 2 \arctan x_t$ $(0 < t < 1)$.

Select another $\phi = \psi$ defined by $\psi = \phi_t$ with $t > 1$, $t$ fixed.
Now we shall prove


THEOREM II. The continuously many rotations $R_t = A(\phi_t)\, B(\psi)\, A^{-1}(\phi_t)$ $(0 < t < 1)$ are free generators of a free rotation group of continuous rank.








We note that all $R_t$ are rotations with the same rotation angle $\psi$ and rotation axes in the same plane.* In particular, the existence of a free rotation group of continuous rank has been established (without using the axiom of choice).


Proof. The theorem is proved if any proper product $P(R_t)$ of a finite number of rotations $R_t$ is unequal to the unity matrix. After simplifications we may write

(8) $P(R_t) = A(\phi_{t_1})\, B^{k_1}(\psi)\, A^{-1}(\phi_{t_1})\, A(\phi_{t_2})\, B^{k_2}(\psi)\, A^{-1}(\phi_{t_2}) \cdots A(\phi_{t_n})\, B^{k_n}(\psi)\, A^{-1}(\phi_{t_n})$,

the $k_i$ being integers $\ne 0$, the $t_i$ $(i = 1, 2, \ldots, n)$ real numbers with

$0 < t_i < 1$, $t_i \ne t_{i+1}$ $(i = 1, 2, \ldots, n - 1)$.

Now replace in the right-hand side of (8) $\psi$ by a real variable $\phi$, and $\phi_{t_i}$ by a multiplicity $m_i\phi$ of $\phi$ ($m_i$ integers $> 0$) with $m_i = m_j$ if and only if $t_i = t_j$. After carrying out the simplifications

$A^{-1}(m_i\phi)\, A(m_{i+1}\phi) = A(l_i\phi) = A^{l_i}(\phi)$ $(l_i \ne 0)$,

(8) is transformed into a proper product (almost) of type (2), and therefore, applying Lemma II, into a non-constant matrix function of $\phi$. From this it follows that at least one of the elements $a_{ik}$ of matrix (8) is a non-constant function of

(9) $\phi_{t_i}$ and $\psi$,

if we consider these for a moment as real variables. But then it is impossible, using substitution (7) and the result of von Neumann, that this function is equal to 0 or 1 if we substitute for the variables (9) their permitted associated transcendental and distinct values. So $P(R_t) \ne I$.


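4.1. It is not necessary, of course, to take the rotation axes in the same plane to get free generators. The proof, just established, furnishes us a general method of generating free groups in the following way. Any rotation $R$ can be written as a product

$R = \begin{pmatrix} \cos\phi_u & -\sin\phi_u & 0 \\ \sin\phi_u & \cos\phi_u & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\phi_v & -\sin\phi_v \\ 0 & \sin\phi_v & \cos\phi_v \end{pmatrix} \begin{pmatrix} \cos\phi_w & -\sin\phi_w & 0 \\ \sin\phi_w & \cos\phi_w & 0 \\ 0 & 0 & 1 \end{pmatrix}$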


*One might ask whether the geometrical structure of this set of rotation axes in the plane can be relatively simple, if we select a suitable set of continuously many values $t$. Indeed, it is possible that this set of rotation axes corresponding to the generating rotations is perfect. This follows easily from the fact that the set of numbers $\{x_t\}$ contains perfect subsets. To prove this, we observe that $x_t$ is a monotonically increasing function of $t$; thus the set of transcendental numbers $\{x_t\}$ is nowhere dense in the set of all real numbers, and is, moreover, a $G_\delta$-set; therefore it contains perfect subsets.








by selecting suitable $\phi_u$, $\phi_v$, $\phi_w$ (Eulerian angles, see (1, p. 104)). Let the $u, v, w$ range as real variables over certain sets, say $0 < u < 1$, $1 < v < 2$, $2 < w < 3$.

Now we consider elements $R$ corresponding with triplets $(u, v, w)$ differing from each other in each of the variables $u, v, w$. Then the elements are free generators of a free group. Indeed, a version of Lemma II on any proper product can be applied after simplifications.

Briefly the proper products do not vanish identically as functions, and cannot therefore be equal to unity for permitted values of their variables, since these values are algebraically independent.


4.2. Remarks. If we consider $A(\phi)$ and $B(\phi)$ as matrix functions (the elements being analytic functions of the real variable $\phi$) it follows from Lemma II and Theorem I that these matrix functions are free generators of a free group (the only constant function in the group being the unity matrix).

In a certain analogy with the generators of Theorem II, one can also consider the family of matrix functions

(10) $A(\phi_\alpha)\, B(\psi)\, A^{-1}(\phi_\alpha)$,

the indices $\alpha$ ranging over a set of arbitrary potency $m$ ($\psi$ and $\phi_\alpha$ being real variables). Therefore these orthogonal matrix functions (10) are free generators of a free group of rank $m$. Any free group can therefore be represented isomorphically by a system of orthogonal matrix functions. Perhaps this may be of some use for the theory of free groups.


5. Conjectures. One could try to prove Theorem I by the alternative method of expanding $\sin\phi$ and $\cos\phi$ in a Taylor series.
Writing
$\sin x = x + o(x^2)$, $\cos x = 1 - \tfrac{1}{2}x^2 + o(x^2)$,

one sees easily that for small $x$ all products of type (2) with $k \le 2$ give a non-constant matrix. But this fails already in the case $k = 3$; taking, for example

$n_1 = 2$, $n_2 = 3$, $n_3 = -5$, $m_1 = -5$, $m_2 = 2$, $m_3 = 3$.


However, for k < 3, a proof is possible if we expand sin x and cos x up to 
o(x*). It will perhaps be possible to get a proof of Theorem I using induction; 
however, the computations involved are very lengthy. On the other hand, it 
may be possible to generalize this method in cases where the generating 
rotations A and B are not perpendicular to each other. Consider 


$A' = CAC^{-1}$ with $C = B(\alpha)$


for a fixed but arbitrary $\alpha$ and carry out the computations for products (2), in which $B$ is replaced by $A'$. This gives the following conjecture (generalizing Theorem I): two rotations with arbitrary but different rotation axes are free generators of a free group for all rotation angles $\phi$, a countable number of values $\phi$ excepted.
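
The failure of the second-order Taylor method for the exponents $n_1 = 2$, $n_2 = 3$, $n_3 = -5$, $m_1 = -5$, $m_2 = 2$, $m_3 = 3$ quoted above can be reproduced by exact arithmetic on truncated power series. The sketch below is an illustration under stated assumptions: every matrix entry is stored as a list of rational coefficients of $1, \phi, \phi^2$, higher powers are discarded, and the helper names are hypothetical.

```python
from fractions import Fraction

ORDER = 2  # discard every power of phi above phi**ORDER

def series(*coeffs):
    """Truncated power series in phi, lowest order first."""
    padded = list(coeffs) + [0] * (ORDER + 1 - len(coeffs))
    return [Fraction(c) for c in padded]

def s_add(p, q):
    return [a + b for a, b in zip(p, q)]

def s_mul(p, q):
    out = [Fraction(0)] * (ORDER + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= ORDER:
                out[i + j] += a * b
    return out

ZERO, ONE = series(0), series(1)
IDENTITY = [[ONE if i == j else ZERO for j in range(3)] for i in range(3)]

def mat_mul(X, Y):
    Z = []
    for i in range(3):
        row = []
        for j in range(3):
            acc = ZERO
            for k in range(3):
                acc = s_add(acc, s_mul(X[i][k], Y[k][j]))
            row.append(acc)
        Z.append(row)
    return Z

def A(n):
    c = series(1, 0, Fraction(-n * n, 2))  # cos(n*phi) up to phi**2
    return [[c, series(0, -n), ZERO], [series(0, n), c, ZERO], [ZERO, ZERO, ONE]]

def B(m):
    c = series(1, 0, Fraction(-m * m, 2))  # cos(m*phi) up to phi**2
    return [[ONE, ZERO, ZERO], [ZERO, c, series(0, -m)], [ZERO, series(0, m), c]]

def word(ns, ms):
    """Second-order expansion of A^{n_1} B^{m_1} ... A^{n_k} B^{m_k}."""
    P = IDENTITY
    for n, m in zip(ns, ms):
        P = mat_mul(mat_mul(P, A(n)), B(m))
    return P

k3_example = word((2, 3, -5), (-5, 2, 3))
k2_example = word((1, -1), (1, -1))
print(k3_example == IDENTITY, k2_example == IDENTITY)
```

To this order the $k = 3$ product is indistinguishable from the unity matrix, whereas the $k = 2$ product already differs from it, which is the obstruction described in the text.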

We conclude with another conjecture: Let $\{G_\alpha\}$ be a family of at most continuously many groups $G_\alpha$, each of which is countable (or more generally consists of less than continuously many elements) and can be represented as a three-dimensional rotation group; now the free product of the groups $G_\alpha$ can be isomorphically represented by a rotation group.


REFERENCES 


1. L. Bieberbach, Analytische Geometrie (Teubner, 1932).
2. Kh. A. Doniakhi, Linear representation of the free product of cyclic groups, Leningrad State Univ. Annals [Uchenye Zapiski] Math. Ser., 10 (1940), 158-165 (Russian).
3. D. J. Fuchs-Rabinowitsch, Leningrad State Univ. Annals [Uchenye Zapiski] Math. Ser., 10 (1940), 154-157 (Russian).
4. J. de Groot and T. Dekker, Free subgroups of the orthogonal group, Comp. Math., 12 (1954), 134-136.
5. F. Hausdorff, Grundzüge der Mengenlehre (1914), 469-472.
6. J. von Neumann, Ein System algebraisch unabhängiger Zahlen, Math. Ann., 99 (1928), 134-141.
7. R. M. Robinson, On the decomposition of spheres, Fund. Math., 34 (1947), 246-260.
8. I. N. Sanov, A property of a representation of a free group, Doklady Akad. Nauk. SSSR (N.S.), 57 (1947), 657-659 (Russian).
9. W. Sierpinski, Sur le paradoxe de la sphère, Fund. Math., 33 (1945), 235-244.


University of Amsterdam








ON A THEOREM OF BAER AND HIGMAN 
SEAN TOBIN 


1. Introduction 


1.1 Baer has shown (1) that if the fact that the exponent of a group is $m$ (that is, $m$ is the least common multiple of the periods of the elements) implies a limitation on the class of the group, then $m$ must be a prime. Graham Higman has extended this result by proving (3) that for any given integer $M$ there are at most a finite number of prime powers $q$ other than primes, such that the fact that a group has exponent $q$ implies a limitation on the class of the $M$th derived subgroup. In fact, given arbitrary positive integers $M$ and $N$, he produces, by an intricate construction, a finite group $G$ having derived length $M + 2$ and prime-power exponent $p^r$, such that the class of the $M$th derived subgroup of $G$ exceeds $N$, where

$p^r - 1 > (p - 1)\, A(M)$
and $A(M)$ is an integer-valued function:
$A(0) = 1$, $A(1) = 3$, $A(2) = 13, \ldots$.


The case $M = 0$ is Baer's result.


1.2 In this paper we consider those prime powers $p^r$ for which
$p^r - 1 = (p - 1)\, A(M)$,

in the special cases $M = 0$ and $M = 1$, that is, $r = 1$ and $p = r = 2$ respectively. We show that no result similar to that of Higman can be obtained in these cases; indeed, an upper bound is given for the class of the $M$th derived subgroup in terms of the derived length.
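
The two special cases can be confirmed by direct enumeration. The sketch below searches small primes $p$ and exponents $r$ for solutions of $p^r - 1 = (p - 1)A(M)$, using only the values $A(0) = 1$ and $A(1) = 3$ quoted above; the search bounds and function names are arbitrary illustrative choices.

```python
A_VALUES = {0: 1, 1: 3}  # A(0) and A(1) as stated in the text

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def solutions(M, max_p=50, max_r=10):
    """Pairs (p, r), p prime below max_p, with p**r - 1 == (p - 1) * A(M)."""
    return [
        (p, r)
        for p in range(2, max_p)
        if is_prime(p)
        for r in range(1, max_r + 1)
        if p ** r - 1 == (p - 1) * A_VALUES[M]
    ]

case_M0 = solutions(0)   # every prime p with r = 1
case_M1 = solutions(1)   # only p = r = 2
print(case_M0[:3], case_M1)
```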

Specifically, the final results are as follows: 


THEOREM 1. If a finitely generated group $G$ has exponent 4 and $\phi$-length $\lambda$, then the class of $\phi(G)$ is at least $2^{\lambda-2}$ and at most $3^{\lambda-2}$.


The meaning of $\phi(G)$ and $\phi$-length of $G$ is explained in §2.1.


THEOREM 2. If a finitely generated group $G$ has prime exponent $p$ and derived length $d$, then the class of $G$ is at least $2^{d-1}$ and at most $p^{d-1}$.


Received July 22, 1955. This paper embodies some of the work carried out by the author
for a Ph.D. thesis at the University of Manchester. I wish to express my gratitude to Dr. Graham
Higman for his continued interest and advice then, and, in connection with the present paper,
for Theorem 5.2 which is essentially due to him both in conception and in proof.




A well-known result due to Hall (2) gives $2^{\lambda-2}$ and $2^{d-1}$, respectively, as
the lower bounds; we shall be concerned here with the upper bounds. The
interest of these results lies, of course, not so much in the bounds given for
the class, which are presumably far from best possible if λ and d are large,
as in the fact that bounds exist which are independent of the number of
generators of the groups in question.

We may mention also an auxiliary theorem which is of interest in itself. 


THEOREM 5.2. If, for a finitely generated group G with prime-power exponent
$p^r$, there exists a positive integer s such that

$$H_{s+1}(G) \subseteq D_2(G),$$

then

$$H_{qs+1}(G) \subseteq H_{q+1}(D(G)), \qquad q = 1, 2, 3, \ldots.$$

Here $H_i(G)$ and $D_i(G)$ are members of the lower central series and the derived
series (§2) of G, respectively.


2. Definitions 


2.1 The Frattini subgroup φ(G) of a group G is defined to be the intersection
of all the maximal subgroups of G; if P is a p-group (by which we mean that
the order of P is a power of a prime p) it is known (2) that $\phi(P) = D(P) \cup P^p$,
where D(P) is the commutator subgroup and $P^p$ the subgroup generated by
the pth powers of all elements in P.

The Frattini series is defined inductively:

$$\phi_0(G) = G, \qquad \phi_{i+1}(G) = \phi(\phi_i(G)), \quad i \geq 0.$$

If this series terminates with the identity (as it certainly does for a finite
group), so that $\phi_i(G) \neq \{1\}$ whenever $i < j$ but $\phi_j(G) = \{1\}$, we shall say
that the φ-length of G is j.

The derived series of a group G is defined inductively: 


$$D_0(G) = G; \qquad D_{i+1}(G) = D(D_i(G)), \quad i \geq 0.$$


If N is a normal subgroup of a p-group P, such that $\phi_i(P) \supseteq N$, it is easily
seen that $\phi_i(P/N) = \phi_i(P)/N$.

Since $u^{-1}v^{-1}uv = (u^{-1})^2(uv^{-1})^2v^2$, we see that if P is a 2-group, then $P^2 \supseteq D(P)$
and $\phi(P) = P^2$. If, in addition, P is generated by elements which all have
period 2, then $D(P) \supseteq P^2$, and consequently $\phi(P) = D(P) = P^2$.
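
The identity expressing a commutator as a product of three squares holds in every group, so it can be spot-checked in a small one. The sketch below verifies it for all pairs of elements of the symmetric group on four letters; representing permutations as tuples and the helper names `mul`, `inv` and `sq` are assumptions made for illustration only.

```python
from itertools import permutations


def mul(a, b):
    """Product of permutations given as tuples: apply a first, then b."""
    return tuple(b[i] for i in a)


def inv(a):
    """Inverse permutation."""
    out = [0] * len(a)
    for i, j in enumerate(a):
        out[j] = i
    return tuple(out)


def sq(a):
    return mul(a, a)


def identity_holds(u, v):
    """Check u^-1 v^-1 u v == (u^-1)^2 (u v^-1)^2 v^2."""
    lhs = mul(mul(inv(u), inv(v)), mul(u, v))
    rhs = mul(mul(sq(inv(u)), sq(mul(u, inv(v)))), sq(v))
    return lhs == rhs


S4 = list(permutations(range(4)))
all_ok = all(identity_holds(u, v) for u in S4 for v in S4)
print(all_ok)
```

A finite check of this kind is not a proof, but it guards against sign or ordering slips when the identity is transcribed.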

Again, if P is a 2-group with m independent generators, $P/\phi(P)$ is elementary
abelian with order $2^m$; thus every factor-group $\phi_i(P)/\phi_{i+1}(P)$ in the
φ-chain of P is an elementary abelian 2-group (i.e., the direct product of
cycles of order 2).


2.2 Square brackets will be used to denote commutation: 
$$[x, y] = x^{-1}y^{-1}xy.$$







If $x_1, x_2, x_3, \ldots$ are arbitrary elements in a group, the (complex) commutators
in the $x_i$ are defined inductively by the rules

(i) $x_i$ is a commutator;

(ii) if c and d are commutators in these elements, so also is [c, d].

In particular, a left-normed (or simple) n-fold commutator is defined:

$$[x_1, x_2, \ldots, x_n] = [[x_1, x_2, \ldots, x_{n-1}], x_n] \qquad (n > 2).$$

The weight of a commutator in the element $x_i$ is defined by

(i) the weight of $x_j$ in $x_i$ is 1 if i = j, 0 if i ≠ j;

(ii) the weight of [c, d] in $x_i$ is the sum of the weights of c and d in $x_i$.

The weight of a commutator is the sum of its weights in the components $x_i$.
We recall the commutator identity

$$[x, yz] = [x, z][x, y][x, y, z].$$

The lower central series of a group G is defined inductively:

$$H_1(G) = G; \qquad H_{i+1}(G) = [H_i(G), G], \quad i \geq 1.$$

(That is, $H_{i+1}$ is the subgroup generated by the set of all commutators of the
form $[h_i, g]$ with $h_i$ in $H_i$ and g in G.) A discussion of the properties of this
series, and its connection with the class of G, may be found in (2).
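
The definitions of this subsection can be made concrete with a short sketch. It assumes the convention $[x, y] = x^{-1}y^{-1}xy$ and the identity $[x, yz] = [x, z][x, y][x, y, z]$, and it models a formal commutator as either an integer i (standing for $x_i$) or a pair of formal commutators; that encoding and all function names are hypothetical, chosen only for illustration.

```python
from itertools import permutations


def mul(a, b):
    """Product of permutations given as tuples: apply a first, then b."""
    return tuple(b[i] for i in a)


def inv(a):
    out = [0] * len(a)
    for i, j in enumerate(a):
        out[j] = i
    return tuple(out)


def comm(x, y):
    """[x, y] = x^-1 y^-1 x y."""
    return mul(mul(inv(x), inv(y)), mul(x, y))


def weight_in(c, i):
    """Weight of the formal commutator c in x_i."""
    if isinstance(c, int):
        return 1 if c == i else 0
    left, right = c
    return weight_in(left, i) + weight_in(right, i)


def weight(c):
    """Total weight: the sum of the weights in all components."""
    if isinstance(c, int):
        return 1
    return weight(c[0]) + weight(c[1])


# Check [x, yz] = [x, z][x, y][x, y, z] over the symmetric group on 3 letters.
S3 = list(permutations(range(3)))
identity_ok = all(
    comm(x, mul(y, z)) == mul(mul(comm(x, z), comm(x, y)), comm(comm(x, y), z))
    for x in S3 for y in S3 for z in S3
)

c = ((1, 2), (1, 3))  # the formal commutator [[x1, x2], [x1, x3]]
print(identity_ok, weight_in(c, 1), weight(c))
```

The example commutator has weight 2 in $x_1$ and total weight 4, in line with rule (ii).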


3. A certain group with exponent 4 


3.1 Let G be a 2-group with φ-length 3. Let $H = \phi_2(G)$, and let $K \cong G/H$.
It is clear that the exponent of G must be 8 or 4, and the exponent of K is 4.
Both H and φ(K) are non-trivial elementary abelian 2-groups, and $\phi(K) \cong \phi(G)/H$.
We use 1 to represent the unit element of G or K according to
context.

In this section we show that the requirement that G have exponent 4
introduces a certain relation into the group ring of K over the field of two
elements; and in §3.2 we show that this relation yields an upper bound for the
class of φ(G).

For each element k in K we choose, in the corresponding coset of G/H, a
coset representative $g_k$ in G. If x, y are any elements of K, the element
h(x, y) in H is determined by the equation

$$g_x g_y = g_z\,h(x, y) \quad \text{where } z = xy.$$

The element $g_k$ induces an automorphism of H which depends only on k;
if h is any element of H we denote its image by

$$h^k = g_k^{-1} h g_k.$$

The automorphisms k belong to the ring Θ of endomorphisms of H, and Θ
has characteristic 2. Since $(h^x)^y = h^z$, where x, y and z have the meanings
already assigned, the subring of Θ generated by the set {k: k ∈ K} is a homomorphic
image of the group ring of K over the field of two elements (0, 1).













Any element g of G can be written uniquely in the form $g = g_k h$ with k
in K and h in H. Then

$$g^4 = g_{k^4}\,h(k^2, k^2)\,h(k, k)^{(1+k)^2}\,h^{(1+k)^3}.$$

We choose $g_1 = 1$; and now in order that G itself may have exponent 4 it is
necessary and sufficient that

(i) $h^{(1+k)^3} = 1$ for all choices of h in H and k in K

and

(ii) $h(k^2, k^2)\,h(k, k)^{(1+k)^2} = 1$ for all choices of k in K.

We shall consider condition (i), which is more amenable to treatment than
(ii). Since

$$h^{(1+k)} = h\,g_k^{-1} h g_k = [h, g_k],$$

what relation (i) says, in effect, is that

$$[h, u, v, w] = 1$$

for any element h in H, whenever the elements u, v, w all lie in the same
coset of G modulo H.


3.2 Thus we consider the group ring of K over the field of two elements
(0, 1), with the relation $(1 + k)^3 = 0$ for all elements k in K.

We use the notation $K_i = 1 + k_i$ for elements $k_i$ in K; throughout what
follows we shall not use k without a subscript to represent an element of K.
To avoid repetition, we make the following convention:

$k_1$ is an arbitrary element of K

$k_2 = k_1^2$

$k_3, k_5, k_7$ are arbitrary elements of φ(K)

$k_4 = [k_1, k_3]$, $k_6 = [k_1, k_5]$, $k_8 = [k_1, k_7]$.

Thus $K_i^2 = 0$ and $K_iK_j = K_jK_i$ when i and j lie between 2 and 8 inclusive.

The relation in the group ring may now be written
3.21 $K_1K_2 = 0.$
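
Relation 3.21 is the cube relation in disguise, because in characteristic 2 one has $K_1^2 = 1 + k_1^2 = K_2$. The sketch below illustrates that arithmetic in the group ring of a cyclic group of order 4 over the field of two elements; it is an assumption-laden toy model (an element is stored as the set of exponents carrying coefficient 1) and not the ring of endomorphisms used in the paper.

```python
N = 4  # order of the cyclic group generated by k


def gf2_add(a, b):
    """Addition mod 2: coefficients cancel in pairs (symmetric difference)."""
    return a ^ b


def gf2_mul(a, b):
    """Product in the group ring over GF(2): multiply out, reduce exponents mod N."""
    out = set()
    for i in a:
        for j in b:
            out ^= {(i + j) % N}
    return frozenset(out)


ONE = frozenset({0})
k = frozenset({1})

K1 = gf2_add(ONE, k)              # 1 + k
K2 = gf2_add(ONE, gf2_mul(k, k))  # 1 + k^2

square = gf2_mul(K1, K1)          # (1 + k)^2
cube = gf2_mul(K1, K2)            # (1 + k)(1 + k^2) = (1 + k)^3
fourth = gf2_mul(cube, K1)        # (1 + k)^4 = 1 + k^4

print(square == K2, sorted(cube), sorted(fourth))
```

In this toy ring the cube is $1 + k + k^2 + k^3$, which is not zero; the vanishing of $K_1K_2$ is therefore a genuine extra relation imposed by the exponent condition, whereas the fourth power vanishes automatically.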

If we replace $k_1$ here by $k_1k_3$, then $k_2$ is replaced by $(k_1k_3)^2 = k_2k_4$,
$K_1$ becomes $1 + k_1k_3 = 1 + (1 + K_1)(1 + K_3) = K_1 + K_3 + K_1K_3$,
and the relation gives

3.22 $(K_1 + K_3 + K_1K_3)(K_2 + K_4 + K_2K_4) = 0.$

Post-multiplication by $K_4$ gives $(K_1 + K_3 + K_1K_3)K_2K_4 = 0$; then
post-multiplication by $K_3$ gives

3.23 $K_1K_3(K_2 + K_4) = 0.$

This leaves $(K_1 + K_3)(K_2 + K_4) = 0$, but $K_1K_2 = 0$, thus finally

3.24 $K_1K_4 + K_3(K_2 + K_4) = 0.$

Using 3.23, premultiplication by $K_1$ gives

3.25 $K_2K_4 = 0.$







In 3.24, replace $k_3$ by $k_3k_5$; then $k_4$ is replaced by $[k_1, k_3k_5] = k_4k_6$. Thus

3.26 $K_3K_6 + K_4K_5 + (K_1 + K_3 + K_5 + K_3K_5)K_4K_6 + K_3K_5(K_2 + K_4 + K_6 + K_4K_6) = 0.$

If we substitute for $k_3$ the particular value $k_3 = [k_1, k_9]$, where $k_9$ is an arbitrary
element in φ(K), then

$$k_4 = [k_9, k_1, k_1] = [k_9, k_1^2] = 1$$

and $K_2K_3 = 0$ by 3.25, thus

$$K_3K_6 + K_3K_5K_6 = 0, \quad \text{which implies } K_3K_6 = 0.$$

Consequently, in equation 3.26, where $k_3$ is again an arbitrary element of
φ(K), we have

3.27 $K_4K_6 = 0$

and 3.26 reduces to

3.28 $K_3K_6 + K_4K_5 + K_2K_3K_5 = 0.$

Again, in 3.28 replace $k_5$ by $k_5k_7$ and $k_6$ by $k_6k_8$. Then, since $K_6K_8 = 0$ by 3.27,

$$K_3(K_6 + K_8) + K_4(K_5 + K_7 + K_5K_7) + K_2K_3(K_5 + K_7 + K_5K_7) = 0.$$

Using 3.28 this simplifies to

$$K_4K_5K_7 = K_2K_3K_5K_7.$$

On multiplying 3.28 by $K_7$ and using this, we obtain $K_3K_6K_7 = 0$, which is
equivalent to saying $K_4K_5K_7 = 0$. Consequently

3.29 $K_2K_3K_5K_7 = 0.$

Now if we take any element $k_9$ in φ(K), then

$$K_9 = K_{11}^2 + K_{12}^2 + \cdots + K_{1q}^2 + (\text{products of two or more of these squares}),$$

where $k_{11}, k_{12}, \ldots, k_{1q}$ are certain elements of K. Thus 3.29 implies that

$$K_3K_5K_7K_9 = 0$$

for all choices of four elements $k_3, k_5, k_7, k_9$ in φ(K).
In the group G this means that for arbitrary elements w, x, y, z in φ(G) and
h in H, [h, w, x, y, z] = 1. Thus we have proved


THEOREM 3.2. If G is a finite group with exponent 4 and φ-length 3, then the
class of φ(G) is at most 5.


3.3 Let G be a finitely generated group with exponent 4: such a group is
finite (5), hence a 2-group. Thus $\phi(G/\phi_3(G)) = \phi(G)/\phi_3(G)$ and the derived
series of φ(G) coincides with its Frattini series. Thus from Theorem 3.2 we
obtain, using the notation previously explained,


COROLLARY 3.3. If G is a finitely generated group with exponent 4,

$$H_6(\phi(G)) \subseteq D_2(\phi(G)).$$













4. A similar result 


Meier-Wunderli has shown, in (4), that a finitely generated metabelian 
group with prime exponent p has class at most p. This may be stated as follows: 


THEOREM 4.1 (Meier-Wunderli). If G is a finitely generated group with prime
exponent p,

$$H_{p+1}(G) \subseteq D_2(G).$$


This result bears an obvious resemblance to Corollary 3.3; they will be extended 
simultaneously by means of the theorems given in the next section. 


5. Some theorems on commutators 


5.1 LEMMA. Let $x_1, x_2, \ldots, x_n$ be arbitrary elements of an arbitrary group G;
let c be a commutator of positive weight in each of the $x_i$ (i = 1, 2, ..., n) and let
the equation

$$c = d_1 d_2 \cdots d_q,$$

where the $d_i$ are also commutators in $x_1, x_2, \ldots, x_n$, be an identity in the group
variables $x_i$. Then there is an equation

$$c = b_1 b_2 \cdots b_p$$

also true for all $x_1, x_2, \ldots, x_n$ in G, where the b's are commutators in the elements
$d_1, d_2, \ldots, d_q$ such that every b is of positive weight in each of $x_1, x_2, \ldots, x_n$.


Proof. This can be proved by induction on n, being trivially true for n = 1.
Thus we may suppose that each $d_i$ is of positive weight in each of $x_1, \ldots, x_{n-1}$.
We may further suppose that the commutators of zero weight in $x_n$ are those
in an initial segment $d_1 d_2 \cdots d_j$. For if this is not so, then using the relation
$yx = xy[y, x]$ we can bring them to the left of the expression one at a time,
by a process which terminates, since the new commutators [y, x] introduced
have positive weight in $x_n$.

When this has been done, let $x_n = 1$. Then every commutator of positive
weight in $x_n$ reduces to the identity, while those of zero weight in $x_n$ are not
affected. Thus for all $x_1, x_2, \ldots, x_n$ in G

$$1 = d_1 d_2 \cdots d_j.$$

Hence also, for all $x_1, x_2, \ldots, x_n$ in G

$$c = d_{j+1} d_{j+2} \cdots d_q,$$

which is the expression required.


5.2 THEOREM. If, for a finitely generated group G with prime-power exponent
$p^r$, there exists a positive integer s such that

$$H_{s+1}(G) \subseteq D_2(G),$$

then

$$H_{qs+1}(G) \subseteq H_{q+1}(D(G)), \qquad q = 1, 2, 3, \ldots.$$




Before proving Theorem 5.2, we make a remark which also has a bearing on
Theorem 6.2. A well-known theorem, due to O. Schreier, states that in a
finitely generated free group any subgroup of finite index is also finitely
generated. Since any finitely generated group is a homomorphic image of a
finitely generated free group, the statement remains true when the words
"free group" are replaced by "group." In particular, let F be a group with a
finite exponent. Then if F is finitely generated, so also is D(F) and every factor
$D_i(F)/D_{i+1}(F)$ in the derived series is finite.


Proof. If N is any normal subgroup of G, $H_i(G/N) = \{H_i(G), N\}/N$;
hence it is sufficient to prove

$$H_{qs+1}(X) \subseteq H_{q+1}(D(X))$$

for the finite group $X = G/D_\alpha(G)$, where α is chosen large enough to ensure that

$$D_\alpha(G) \subseteq H_{q+1}(D(G)).$$

Thus we consider a p-group X of exponent $p^r$, generated by a minimal
basis $x_1, x_2, \ldots, x_n$. A result due to Hall (2; Theorem 2.8.2) states that $H_j(X)$
is generated by the set of all left-normed commutators of weight ≥ j in the
components $x_1, x_2, \ldots, x_n$. Thus $D(X) = H_2(X)$ is generated by the simple
commutators

$$[x_{i_1}, x_{i_2}, \ldots, x_{i_t}], \qquad t \geq 2.$$

But $H_{s+1}(X) \subseteq D_2(X)$; hence the simple commutators with t ≥ s + 1 all lie
in $D_2(X) \subseteq \phi(D(X))$, and can therefore (2) be omitted from any generating
set of D(X). Thus D(X) is in fact generated by the simple commutators with
2 ≤ t ≤ s; we denote these by $d_1, d_2, \ldots$.

If we set

$$c = [x_{i_1}, x_{i_2}, \ldots, x_{i_{qs+1}}]$$

then c lies in $D_2(X)$, so

$$c = d_{j_1} d_{j_2} \cdots d_{j_m}.$$

Consequently, by Lemma 5.1,

$$c = b_1 b_2 \cdots b_k,$$

where each $b_i$ is a commutator in the d's and therefore also in the x's and is
of positive weight in $x_{i_j}$, j = 1, 2, ..., qs + 1.


Thus, in particular, each $b_i$ is of weight at least qs + 1 in the x's. Since no
commutator d has weight greater than s in the x's, each $b_i$ must be of weight
at least q + 1 in the d's. The required result follows.


5.3 COROLLARY. If, in addition to the assumptions of the previous theorem,

$$H_{s+1}(D_\beta(G)) \subseteq D_{\beta+2}(G) \quad \text{for every positive integer } \beta,$$

then

$$H_{s^d+1}(G) \subseteq D_{d+1}(G) \quad \text{for any positive integer } d.$$













Proof. $H_{s+1}(G) \subseteq D_2(G)$ gives a basis for induction on d. We assume that
the statement is true for an integer d. Then

$$H_{s^{d+1}+1}(G) \subseteq H_{s^d+1}(D(G)) \quad \text{by Theorem 5.2,}$$
$$\subseteq D_{d+1}(D(G)) \quad \text{by the induction hypothesis,}$$

since D(G) itself satisfies the conditions stated for G.
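
The bookkeeping in this induction is simple enough to trace mechanically. The sketch below assumes the reading of Theorem 5.2 in which $H_{qs+1}(G)$ lies in $H_{q+1}(D(G))$, so that each descent by one derived step multiplies the index parameter q by s; the function name `central_index` is hypothetical.

```python
def central_index(s, d):
    """Index n with H_n(G) contained in D_{d+1}(G), following the induction.

    The base case d = 1 is the hypothesis H_{s+1}(G) in D_2(G); each further
    derived step replaces q by q * s, giving n = s**d + 1.
    """
    if d < 1:
        raise ValueError("d must be a positive integer")
    q = s
    for _ in range(d - 1):
        q = q * s
    return q + 1


# s = 5 is the value supplied by Theorem 3.2 (class at most 5).
print([central_index(5, d) for d in (1, 2, 3)])
```

The resulting bounds grow exponentially in d but do not involve the number of generators, which is the point stressed in the introduction.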


6. Final results 


All that remains now is to apply 5.3 to the groups considered in 3.3 and 4.1.
If G is finitely generated with exponent 4, so also are its successive Frattini
subgroups; and

$$D_\beta(\phi(G)) = \phi_\beta(\phi(G)) = \phi(\phi_\beta(G));$$

thus by Corollary 3.3,

$$H_6(D_\beta(\phi(G))) \subseteq D_{\beta+2}(\phi(G)).$$

Corollary 5.3 now gives

THEOREM 6.1. If G is a finitely generated group with exponent 4:

$$H_{5^d+1}(\phi(G)) \subseteq \phi_{d+2}(G).$$


Again, if G is finitely generated with prime exponent p, so also is $D_\beta(G)$, and
by Theorem 4.1,

$$H_{p+1}(D_\beta(G)) \subseteq D_{\beta+2}(G).$$

THEOREM 6.2. If G is a finitely generated group with prime exponent p:

$$H_{p^d+1}(G) \subseteq D_{d+1}(G).$$


REFERENCES 


1. R. Baer, The higher commutator subgroups of a group, Bull. Amer. Math. Soc., 50 (1944),
143-160.
2. P. Hall, A contribution to the theory of groups of prime power order, Proc. Lond. Math. Soc.,
36 (1933), 29-95.
3. G. Higman, Note on a theorem of R. Baer, Proc. Camb. Phil. Soc., 46 (1949), 321-327.
4. H. Meier-Wunderli, Über endliche p-Gruppen deren Elemente der Gleichung $x^p = 1$ genügen,
Comment. Math. Helv., 24 (1950), 18-45.
5. I. N. Sanov, Solution of Burnside's Problem for exponent 4, Leningrad State Univ. Annals,
Math. Ser., 10 (1940), 166-170.




University College, Galway, Ireland 







ON COMMUTING RINGS OF ENDOMORPHISMS 


C. W. CURTIS 


1. Introduction. Various problems concerning the general theory of 
centralizers of modules which are not assumed to be completely reducible have 
been discussed by Fitting (3), Brauer (2), and Nakayama. In this paper we 
present a new approach to some of these questions, which has its origin in 
Weyl’s discussion (15) of the centralizer of a finite group of collineations. 

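Let $\mathfrak{B}$ be a ring with an identity element, and let $\mathfrak{M}'$ and $\mathfrak{M}$ be unital¹ left
and right $\mathfrak{B}$-modules, respectively. We assume that there exists a function
$\tau(\psi, x)$ on $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{B}$ which is bilinear with respect to $\mathfrak{B}$, and
non-degenerate. The set $\mathfrak{b}$ of all finite sums $\sum \tau(\psi_i, x_i)$ is a two-sided ideal in $\mathfrak{B}$,
called the nucleus of the pairing $(\mathfrak{M}', \mathfrak{M}, \tau)$. Let $\mathfrak{E}$ be the ring of all
$\mathfrak{B}$-endomorphisms of $\mathfrak{M}$. Then $\mathfrak{E}$ contains the right ideal $\mathfrak{M}' \odot \mathfrak{M}$ consisting of all
finite sums of the endomorphisms $\psi \odot u$ of $\mathfrak{M}$, where $x(\psi \odot u) = u\tau(\psi, x)$,
$x \in \mathfrak{M}$. By a centralizer $\mathfrak{C}$ of $\mathfrak{M}$ relative to $\mathfrak{B}$ we mean a subring $\mathfrak{C}$ of $\mathfrak{E}$
containing the right ideal $\mathfrak{M}' \odot \mathfrak{M}$.

Our basic assumption is that the nucleus $\mathfrak{b}$ contains a two-sided identity
element. Then it is proved in §5 that the ring of $\mathfrak{C}$-endomorphisms of $\mathfrak{M}$ is
precisely the set of endomorphisms $R_b: x \to xb$ determined by the elements of
$\mathfrak{B}$. Let $\mathfrak{R}$ be a $\mathfrak{C}$-direct summand of $\mathfrak{M}$; then $\tau(\mathfrak{M}', \mathfrak{R})$ is a left ideal in $\mathfrak{b}$, and
the mapping $\mathfrak{R} \to \tau(\mathfrak{M}', \mathfrak{R})$ is a (1-1) mapping, preserving direct sums,
intersections, and isomorphism relations, between the set of $\mathfrak{C}$-direct summands
of $\mathfrak{M}$ and the set of left ideal direct components of $\mathfrak{b}$. Dually, if $\mathfrak{M}' \odot \mathfrak{M}$
contains the identity operator on $\mathfrak{M}$, and if the pairing $\psi \odot u$ is non-degenerate,
then the mapping $\mathfrak{R} \to \mathfrak{M}' \odot \mathfrak{R}$ defines a (1-1) mapping between the set of
$\mathfrak{B}$-direct summands of $\mathfrak{M}$ and the set of left ideal direct components of the
centralizer $\mathfrak{C}$. If $\mathfrak{B}$ satisfies the minimum condition for left ideals, then every
indecomposable $\mathfrak{C}$-direct summand $\mathfrak{R}$ of $\mathfrak{M}$ contains a unique maximal
$\mathfrak{C}$-submodule, and if $\mathfrak{R}_1$ and $\mathfrak{R}_2$ are indecomposable $\mathfrak{C}$-direct summands, then
$\mathfrak{R}_1$ and $\mathfrak{R}_2$ are $\mathfrak{C}$-isomorphic if and only if $\mathfrak{R}_1/\mathfrak{S}_1$ and $\mathfrak{R}_2/\mathfrak{S}_2$ are $\mathfrak{C}$-isomorphic,
where $\mathfrak{S}_i$ is the unique maximal submodule of $\mathfrak{R}_i$, i = 1, 2.

The principal application of this theory is to projective (or ray) representations
of a finite group $\mathfrak{G}$ by s.l.t. (semi-linear transformations) of a vector
space $\mathfrak{M}$ over a division ring $\Delta$. If $\mathfrak{B} = \Delta(\mathfrak{G}, H, \rho)$ is the crossed product
associated with the projective representation, then it is proved in §2 that a
space $\mathfrak{M}'$, and a pairing $\tau$ of $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{B}$ which satisfies our hypotheses,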

Received August 27, 1955. Presented to the American Mathematical Society August 30, 
1955. (The author is a National Research Fellow.) 


¹A left or right $\mathfrak{B}$-module $\mathfrak{M}$ is called unital if the identity element of $\mathfrak{B}$ acts as identity
operator on $\mathfrak{M}$.




can be constructed if and only if the normalized factor set $\rho$ satisfies the
condition $\rho_{s,s^{-1}} = 1$ for all s in $\mathfrak{G}$. In §3 the pairing considered by Weyl (15)
is defined, and shown to satisfy our hypotheses, so that Weyl's results are
consequences of the theorems proved in §5. In §4 and §8 some special results
are derived which concern the pairings obtained in §2 from projective
representations of finite groups. A few remarks are included in §9 on the applications
of the results on projective representations to the Galois theory of primitive
rings with minimal ideals. A direct proof is given in §10 of the fact that the
centralizer of a symmetric algebra $\mathfrak{A}$ of l.t. in a finite dimensional vector space
$\mathfrak{M}$ which is a projective $\mathfrak{A}$-module is a symmetric algebra.


2. Projective representations of finite groups². Let $\mathfrak{M}$ be a commutative
group, and $\Delta$ a division ring consisting of endomorphisms $\xi: x \to x\xi$ of $\mathfrak{M}$,
such that $\Delta$ contains the identity mapping. Then $\mathfrak{M}$ is a right vector space
over $\Delta$. Two non-singular s.l.t. $T_1$ and $T_2$ in $\mathfrak{M}$ over $\Delta$ are said to be equivalent
if $T_1 = T_2\mu$, where $\mu$ is a non-zero element of $\Delta$. An equivalence class {T} of
non-singular s.l.t. is called a projective transformation. Multiplication of
projective transformations is defined in the obvious way, and the projective
transformations form a group $\mathfrak{P}(\mathfrak{M}, \Delta)$.

Now let $\mathfrak{G} = \{1, s, t, \ldots\}$ be a finite group. A homomorphism of $\mathfrak{G}$ into
$\mathfrak{P}(\mathfrak{M}, \Delta)$ is called a projective representation of $\mathfrak{G}$. Evidently a projective
representation is determined by a mapping $s \to T_s$ of $\mathfrak{G}$ into the set of
non-singular s.l.t. of $\mathfrak{M}$ such that

(1) $T_sT_t = T_{st}\,\rho_{s,t}$

where the $\rho_{s,t}$ are certain non-zero elements of $\Delta$. From the associative law
and (1) we obtain

(2) $\xi^{\bar s\bar t} = \rho_{s,t}^{-1}\,\xi^{\overline{st}}\,\rho_{s,t}$

where $\bar s: \xi \to \xi^{\bar s}$ and $\bar t$ are the automorphisms of $\Delta$ determined by the s.l.t. $T_s$
and $T_t$, and

(3) $\rho_{s,tu}\,\rho_{t,u} = \rho_{st,u}\,\rho_{s,t}^{\bar u}.$

If we denote the inner automorphism $\xi \to \rho_{s,t}^{-1}\xi\rho_{s,t}$ by $\bar\rho_{s,t}$, then (2) becomes

(2') $\bar s\,\bar t = \overline{st}\,\bar\rho_{s,t}.$

A set $\{\rho_{s,t}; \bar s\}$, where the $\rho_{s,t}$ are non-zero elements of $\Delta$, and the $\bar s$ are
automorphisms of $\Delta$, is called a factor set of $\mathfrak{G}$ (in $\Delta$) if the equations (2) and
(3) hold. Thus the transformations $T_s$ satisfying (1) determine a factor set
$\{\rho_{s,t}; \bar s\}$. If we replace the representatives $T_s$ of the projective transformations
corresponding to the elements of $\mathfrak{G}$ by new representatives $T'_s = T_s\mu_s$, then
we obtain

$$T'_sT'_t = T'_{st}\,\rho'_{s,t}$$


²For the terminology introduced in the first part of this section, see (6, Chap. 4, §17, 18).




where the automorphisms $\bar s'$ associated with $T'_s$ satisfy

(4) $\bar s' = \bar s\,\bar\mu_s,$

where $\bar\mu_s$ is the inner automorphism $\xi \to \mu_s^{-1}\xi\mu_s$, and

(5) $\rho'_{s,t} = \mu_{st}^{-1}\,\rho_{s,t}\,\mu_s^{\bar t}\,\mu_t.$

Thus it is natural to say that two factor sets $\{\rho_{s,t}; \bar s\}$ and $\{\rho'_{s,t}; \bar s'\}$ are
equivalent if there exist elements $\mu_s \neq 0$ such that (4) and (5) hold. Then a
projective representation determines a class of equivalent factor sets.
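
A small numerical illustration may help fix the conventions. It is a sketch under strong simplifying assumptions that are not made in the paper: the division ring is replaced by the rational numbers, all automorphisms are trivial, the group is cyclic of order 4 written additively, and condition (3) is read as $\rho_{s,tu}\rho_{t,u} = \rho_{st,u}\rho_{s,t}^{\bar u}$. In that setting, applying (5) to the trivial factor set gives $\rho'_{s,t} = \mu_{st}^{-1}\mu_s\mu_t$, which should again satisfy (3). The names `mu`, `rho` and `cocycle_ok` are illustrative.

```python
from fractions import Fraction
from itertools import product

n = 4  # the cyclic group Z/4, written additively

# Arbitrary non-zero scalars mu_s, with mu_0 = 1 so the result is normalized.
mu = {s: Fraction(s + 1, 2 * s + 1) for s in range(n)}

# Factor set equivalent to the trivial one (trivial automorphisms assumed).
rho = {(s, t): mu[s] * mu[t] / mu[(s + t) % n] for s in range(n) for t in range(n)}

# Condition (3) with trivial automorphisms.
cocycle_ok = all(
    rho[s, (t + u) % n] * rho[t, u] == rho[(s + t) % n, u] * rho[s, t]
    for s, t, u in product(range(n), repeat=3)
)

# Normalization: rho_{1,s} = rho_{s,1} = 1, where the identity is 0 here.
normalized = all(rho[0, s] == 1 and rho[s, 0] == 1 for s in range(n))

print(cocycle_ok, normalized)
```

Exact rational arithmetic is used so that the equalities are tested without rounding error.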

Now let $\{\rho_{s,t}; \bar s\}$ be a factor set, and let $\{b_s\}$ be a set of elements in (1-1)
correspondence with the elements in $\mathfrak{G}$. The set $\mathfrak{B}$ of formal expressions
$\sum b_s\xi_s$, $\xi_s \in \Delta$, $s \in \mathfrak{G}$, becomes an associative ring if we define two expressions
to be equal if and only if they have the same coefficients, and if addition is
defined componentwise, and multiplication using the distributive laws and
the rules

$$b_sb_t = b_{st}\,\rho_{s,t}, \qquad \xi b_s = b_s\xi^{\bar s}.$$

Then $\mathfrak{B}$ is called a crossed product $\Delta(\mathfrak{G}, H, \rho)$ with correspondence $s \to \bar s = s^H$,
and factor set $\rho$. If $\{\rho'_{s,t}; \bar s'\}$ is a factor set equivalent to $\{\rho_{s,t}; \bar s\}$, and if $\mathfrak{B}'$
is a crossed product $\Delta(\mathfrak{G}, H', \rho')$ with correspondence

$$s \to \bar s' = s^{H'}$$

and factor set $\rho'$, then it is easily verified that $\mathfrak{B}$ and $\mathfrak{B}'$ are isomorphic.

As Jacobson observes (6, p. 82), the element $b_1\rho_{1,1}^{-1}$ is an identity 1 for $\mathfrak{B}$,
and if we identify $\Delta$ with the division subring $1\Delta$ of $\mathfrak{B}$, then every element of
$\mathfrak{B}$ can be expressed uniquely in the form $\sum b_s\xi_s$, where the term $b_s\xi_s$ is now the
product of $b_s = b_s1$ with $\xi_s$. It follows that $\mathfrak{B}$ is a two-sided vector space of
finite left and right dimension over $\Delta$, and consequently $\mathfrak{B}$ satisfies both chain
conditions for left and right ideals.

If $s \to T_s$ defines a projective representation of $\mathfrak{G}$ with correspondence H
and factor set $\rho$, then

$$\sum b_s\xi_s \to \sum T_s\xi_s$$

defines a representation of $\mathfrak{B}$ by endomorphisms of the representation space
$\mathfrak{M}$ such that the identity element of $\mathfrak{B}$ is mapped onto the identity mapping
in $\mathfrak{M}$, while conversely any such representation of $\mathfrak{B}$ by endomorphisms of $\mathfrak{M}$
gives rise to a projective representation of $\mathfrak{G}$ with the same correspondence
and factor set.

Now let $\mathfrak{M}$ be a unital right $\mathfrak{B}$-module, and hence, in particular, a right vector
space over $\Delta$. Let $\mathfrak{M}'$ be a left vector space dual to $\mathfrak{M}$ with respect to a
non-degenerate bilinear form $(\psi, x)$ on $\mathfrak{M}' \times \mathfrak{M} \to \Delta$, such that the s.l.t. $R_s$:
$x \to xb_s$ determined by the elements of $\mathfrak{G}$ all have transposes $R_s^*$ relative to
the form $(\psi, x)$. Thus $R_s^*$ is a s.l.t. of $\mathfrak{M}'$ with automorphism $\bar s^{-1}$ such that
(if we write operators on $\mathfrak{M}'$ to the left),

(6) $(\psi, xR_s)^{\bar s^{-1}} = (R_s^*\psi, x)$

for all $\psi$ and x.
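We prove first that if we set $(\sum b_s\xi_s)\psi = \sum R_s^*(\xi_s\psi)$, then $\mathfrak{M}'$ becomes a
unital left $\mathfrak{B}$-module. For all x and $\psi$, we have, since $1 = b_1\rho_{1,1}^{-1}$,

$$(1\psi, x) = (R_1^*(\rho_{1,1}^{-1}\psi), x) = (\rho_{1,1}^{-1}\psi, xR_1)^{\bar 1^{-1}} = (\psi, x)$$

by (2'), and hence $1\psi = \psi$. In order to prove that $\mathfrak{M}'$ is a left $\mathfrak{B}$-module, it is
sufficient to prove that $(ab)\psi = a(b\psi)$. For all x and $\psi$, we have

$$((b_s\xi\,b_t\eta)\psi, x) = (R_{st}^*(\rho_{s,t}\,\xi^{\bar t}\eta\psi), x)$$
$$= (\rho_{s,t}\,\xi^{\bar t}\eta\psi, xR_{st})^{\overline{st}^{-1}} = (\xi^{\bar t}\eta\psi, xR_sR_t)^{\bar\rho_{s,t}^{-1}\,\overline{st}^{-1}}$$
$$= (R_s^*(R_t^*(\xi^{\bar t}\eta\psi)), x) = (b_s\xi(b_t\eta\psi), x)$$

by (2'), and the conclusion follows from the non-degeneracy of the form.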
We wish to study the centralizer of $\mathfrak{M}$ relative to $\mathfrak{B}$. Neither the centralizer,
nor the projective representation corresponding to $\mathfrak{M}$, nor the crossed product
$\mathfrak{B} = \Delta(\mathfrak{G}, H, \rho)$ is changed if we change the basis $(b_s)$ of $\mathfrak{B}$ to $(b_s\mu_s)$, where
the $\mu_s$ are non-zero elements of $\Delta$. In particular, if we set $\mu_1 = \rho_{1,1}^{-1}$ and $\mu_s = 1$ if
$s \neq 1$, then the equivalent factor set $\{\rho'_{s,t}; \bar s'\}$ corresponding to the new
basis $(b_s\mu_s)$ has the property that $\rho'_{1,1} = 1$, and by an application of (3)
(see (6)) it follows that $\rho'_{1,s} = \rho'_{s,1} = 1$. There is no loss of generality in
assuming that our original factor set is normalized in this way, and in the rest
of the paper, this normalization will be tacitly assumed.


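PROPOSITION 1. The mapping

(7) $\tau(\psi, x) = \sum_s b_s\,(\psi, xR_{s^{-1}})^{\bar s}$

on $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{B}$ is homogeneous, in the sense that the equations

(8) $\tau(b\psi, x) = b\,\tau(\psi, x)$ and $\tau(\psi, xb) = \tau(\psi, x)\,b$, $b \in \mathfrak{B}$,

hold, if and only if the (normalized) factor set of $\mathfrak{B}$ satisfies the condition $\rho_{s,s^{-1}} = 1$
for all s in $\mathfrak{G}$.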


Proof. In the proof of this result, we shall use the abbreviation u' for $u^{-1}$,
$u \in \mathfrak{G}$. It is an easy matter to verify that the equations (8) hold if b is an
element of $\Delta$. From (7) it follows that $\tau(\psi, x)$ is biadditive, and consequently
the homogeneity is equivalent to the equations

(8') $\tau(b_u\psi, x) = b_u\tau(\psi, x)$, $\tau(\psi, xb_u) = \tau(\psi, x)b_u$, $u \in \mathfrak{G}$.
The coefficient of b, in r(b,y, x) is 
(Ry ¥, xRyy = (¥, cRyveprn) _ (y, Rea) pew 








—— 







The coefficient of 5, in byr(y, x) is 
Puan’ AV; xRy.)" = Pun’ Pu’. KV, Ry)” "et.s 
= A, Paw V; xR Pe eat OO = (y, Ryu) paw’, t 


by (2’) and (3), and the facts that p:,, = 1, and a’ = @’p, , by (2’). Thus the 
first equation in (8’) holds if and only if 


(9) Pre = Pu Pe. 
The coefficient of b, in r(y, xb,) is (¥, xRyvp,.v)', while the coefficient of 5, 
in r(y, x)d, is 
Pa’ ul¥, xReyv)™” ‘= Pw uP’ wah; *Ruv)' pwn 
by (2). Hence the second equation in (8’) holds if and only if 
(10) pu. = Pw’ .u 
Setting ¢ = u in (10) we obtain 


7 
Puu’ os Plu = 1, 


and hence p, = 1, so that the condition is necessary. 
Assume now that p,,, = 1 for all u. By (3) we have 


(11) 1 = pose. = Pur. Poe 
and 


1 = 91, Pwi.u = Pu’.uePu,e 
Upon substituting u — tu’ and v — u in the last equation we obtain 


(12) Our Pw’ u = 1, 
and by comparing (11) and (12) we obtain (10). The condition implies that 


u’ = i’, and we have 


Pv ube’. = Put. ’Pru = 1 

by (10) and (12), proving (9). This completes the proof. 

For an example of a projective representation whose factor set satisfies the 
condition of Proposition 1, but is not equivalent to one, see (17, p. 182). 

The pairing $\tau(\psi, x)$ defined in (7) is non-degenerate in the sense that
$\tau(\mathfrak{M}', x) = 0$ implies x = 0, and $\tau(\psi, \mathfrak{M}) = 0$ implies $\psi = 0$. This remark
follows from the fact that $\tau(\mathfrak{M}', x) = 0$ implies $(\mathfrak{M}', xR_1) = (\mathfrak{M}', x) = 0$
since $R_1$ is the identity operator, and the non-degeneracy of the form $(\psi, x)$.

An endomorphism C of $\mathfrak{M}$ is said to belong to the centralizer $\mathfrak{C}$ of $\mathfrak{M}$ relative
to $\mathfrak{B}$ if (xb)C = (xC)b for all b in $\mathfrak{B}$, x in $\mathfrak{M}$, and if there exists an
endomorphism C* of $\mathfrak{M}'$ such that $(C^*\psi, x) = (\psi, xC)$ for all x and $\psi$. An element of
$\mathfrak{C}$ is necessarily a l.t. in $\mathfrak{M}$ over $\Delta$, and it follows that C*, which is uniquely
determined, is also a l.t. in $\mathfrak{M}'$ over $\Delta$.













PROPOSITION 2. An endomorphism C of $\mathfrak{M}$ is an element of $\mathfrak{C}$ if and only if
there exists an endomorphism C** on $\mathfrak{M}'$ such that $\tau(C^{**}\psi, x) = \tau(\psi, xC)$ for
all x and $\psi$.


Proof. If $C \in \mathfrak{C}$ then evidently the transpose C* of C relative to the form
$(\psi, x)$ satisfies the equation $\tau(C^*\psi, x) = \tau(\psi, xC)$. Conversely, if C** is given,
then upon comparing the coefficients of $b_1$, we obtain $(C^{**}\psi, x) = (\psi, xC)$.
For all $\xi \in \Delta$,

$$\tau(\psi, (x\xi)C) = \tau(C^{**}\psi, x)\,\xi = \tau(\psi, (xC)\xi),$$

and by the non-degeneracy of the form $\tau$, C is linear. Similarly C** is a l.t.,
and hence C** is the uniquely determined transpose of C relative to the form
$(\psi, x)$. Then for all s in $\mathfrak{G}$, comparison of the coefficients of $b_{s^{-1}}$ yields
$(C^{**}\psi, xR_s) = (\psi, xCR_s)$, and hence $(\psi, xCR_s) = (\psi, xR_sC)$ so that $CR_s = R_sC$
since $\mathfrak{M}'$ and $\mathfrak{M}$ are dual. It follows that C is a $\mathfrak{B}$-endomorphism of $\mathfrak{M}$, and the
proof is complete.


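We shall call the system $(\mathfrak{M}', \mathfrak{M}, \tau)$ a pairing in case $\tau$ is bilinear and
non-degenerate ($\tau$ is bilinear if $\tau$ is biadditive and homogeneous relative to right
and left multiplication by elements of $\mathfrak{B}$). Necessary and sufficient conditions
for the bilinearity of $\tau$ are given in Proposition 1. From the bilinearity of $\tau$
it follows that

$$\mathfrak{b} = \tau(\mathfrak{M}', \mathfrak{M}) = \{\textstyle\sum \tau(\psi_i, x_i) \mid \psi_i \in \mathfrak{M}',\ x_i \in \mathfrak{M}\}$$

is a two-sided ideal in $\mathfrak{B}$, which we shall call the nucleus of the pairing.
With a pairing $(\mathfrak{M}', \mathfrak{M}, \tau)$, we shall associate a dual pairing $(\psi, u) \to \psi \odot u$
of $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{C}$, where $\psi \odot u$ is the endomorphism of $\mathfrak{M}$ defined by

(13) $x(\psi \odot u) = u\,\tau(\psi, x)$, $x \in \mathfrak{M}$.

It is easily verified that if $(\psi \odot u)^*$ is the endomorphism of $\mathfrak{M}'$ defined by

(14) $(\psi \odot u)^*\varphi = \tau(\varphi, u)\,\psi$, $\varphi \in \mathfrak{M}'$,

then $\tau(\varphi, x(\psi \odot u)) = \tau((\psi \odot u)^*\varphi, x)$, and by Proposition 2, it follows that
the mappings $\psi \odot u$ are in $\mathfrak{C}$. The action of $\mathfrak{C}$ upon $\mathfrak{M}$ makes $\mathfrak{M}$ a right
$\mathfrak{C}$-module, while $\mathfrak{M}'$ becomes a left $\mathfrak{C}$-module if we set $C\psi = C^*\psi$, where C*
is the transpose of C relative to the forms $\tau$ and $(\psi, x)$. It is immediate that
the pairing $\psi \odot u$ is bilinear, that is, it is biadditive, and

$$C(\psi \odot u) = (C\psi) \odot u; \qquad (\psi \odot u)C = \psi \odot uC, \qquad C \in \mathfrak{C}.$$

A sufficient condition that $\psi \odot u$ be non-degenerate is that $x \neq 0$, $\psi \neq 0$
imply $x\mathfrak{b} \neq 0$, $\mathfrak{b}\psi \neq 0$, where $\mathfrak{b} = \tau(\mathfrak{M}', \mathfrak{M})$ is the nucleus of the original
pairing. Indeed, suppose that $\psi \odot \mathfrak{M} = 0$. Then

$$\mathfrak{M}(\psi \odot \mathfrak{M}) = \mathfrak{M}\,\tau(\psi, \mathfrak{M}) = 0.$$

Therefore $\tau(\mathfrak{b}\psi, \mathfrak{M}) = 0$, and $\mathfrak{b}\psi = 0$ by the non-degeneracy of $\tau$. Therefore
$\psi = 0$. Similarly $\mathfrak{M}' \odot x = 0$ implies x = 0. We have proved the following
result.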


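PROPOSITION 3. Let $(\mathfrak{M}', \mathfrak{M}, \tau)$ be a pairing. Then $(\psi, u) \to \psi \odot u$ defines a
pairing of $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{C}$ which is bilinear. The pairing $\psi \odot u$ is non-degenerate
if $x \neq 0$, $\psi \neq 0$ imply $x\mathfrak{b} \neq 0$ and $\mathfrak{b}\psi \neq 0$, where $\mathfrak{b}$ is the nucleus of the pairing $\tau$.
The set $\mathfrak{c} = \mathfrak{M}' \odot \mathfrak{M}$ consisting of all finite sums $\sum \psi_i \odot u_i$ is a two-sided ideal
in $\mathfrak{C}$.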

The mappings $\psi \odot u$ belonging to the nucleus of the pairing defined by (13)
can be characterized quite simply if we use the formalism of finite valued l.t.
(8, Chap. VIII). Every finite valued l.t. X in $\mathfrak{M}$ over $\Delta$ which possesses a
transpose X* relative to $(\psi, x)$ can be expressed in the form

$$X = \sum \psi_i \times u_i, \qquad \psi_i \in \mathfrak{M}',\ u_i \in \mathfrak{M},$$

where $x(\sum \psi_i \times u_i) = \sum u_i(\psi_i, x)$, $x \in \mathfrak{M}$. We wish to prove the formula

(15) $\psi \odot u = \sum_s R_{s^{-1}}(\psi \times u)R_s.$

We have for all x,

$$x\sum_s R_{s^{-1}}(\psi \times u)R_s = \sum_s u(\psi, xR_{s^{-1}})R_s = \sum_s uR_s(\psi, xR_{s^{-1}})^{\bar s} = u\,\tau(\psi, x) = x(\psi \odot u).$$

Various special cases of the situation considered in this section are of
importance. We should like to mention especially the applications to affine
representations of finite groups (6, p. 81), where all $\rho_{s,t} = 1$, and consequently
the pairing $\tau$ is bilinear in all cases, by Proposition 1; and to ordinary
representations of groups, where all $\rho_{s,t} = 1$, all $\bar s = 1$, and $\Delta$ is a field.


3. A pairing constructed by Weyl. We shall discuss a pairing introduced
by Weyl (15) which differs from the one we have defined in §2 in that its
bilinearity depends upon the existence of an involution in the crossed product
$\mathfrak{B}$. We consider an affine representation $s \to U_s$ of a finite group $\mathfrak{G}$ by s.l.t.
in a vector space $\mathfrak{M}$ over a field $\Phi$; then all $\rho_{s,t} = 1$, and $s \to \bar s$ is a
homomorphism. Let $\mathfrak{B} = \Phi(\mathfrak{G}, H, 1)$ be the crossed product constructed as in §2.
In this case we have $b_sb_t = b_{st}$, and $\xi b_s = b_s\xi^{\bar s}$, $\xi \in \Phi$. Since $\Phi$ is commutative
it follows that the mapping

$$\sum b_s\xi_s \to \sum \xi_s b_{s^{-1}}$$

is an involution in $\mathfrak{B}$. We obtain a representation of $\mathfrak{B}$ by endomorphisms of
$\mathfrak{M}$ by setting

$$xU(\textstyle\sum b_s\xi_s) = \sum (xU_s)\xi_s.$$

Then $\mathfrak{M}$ becomes a left $\mathfrak{B}$-module (and a left vector space over $\Phi$) if we define
$bx = xU(b')$, $x \in \mathfrak{M}$, $b \in \mathfrak{B}$. The right vector space $\mathfrak{M}^*$ of all linear functions
on $\mathfrak{M}$ becomes a right $\mathfrak{B}$-module if we define













(> be.) = ¥ wu'O.-ve, v EM, 
where 
U" (b,-1) 


is the transpose of the s.1.t. U(0,). 
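We introduce a pairing $\sigma$ on $\mathfrak{M} \times \mathfrak{M}^* \to \mathfrak{B}$ by means of the following
formula:

(16) $\sigma(x, \psi) = \sum_s b_s\,(xU_s, \psi),$

where $(x, \psi)$ is the bilinear form on $\mathfrak{M} \times \mathfrak{M}^* \to \Phi$. It is not difficult to verify
that $\sigma$ is bilinear:

$$\sigma(x_1 + x_2, \psi) = \sigma(x_1, \psi) + \sigma(x_2, \psi),$$
$$\sigma(x, \psi_1 + \psi_2) = \sigma(x, \psi_1) + \sigma(x, \psi_2),$$
$$\sigma(bx, \psi) = b\,\sigma(x, \psi), \qquad \sigma(x, \psi b) = \sigma(x, \psi)\,b, \qquad b \in \mathfrak{B},$$

and that $\sigma$ is non-degenerate: $\sigma(\mathfrak{M}, \psi) = 0$ implies $\psi = 0$, and $\sigma(x, \mathfrak{M}^*) = 0$
implies x = 0.

Let $\mathfrak{C}$ be the ring of $\mathfrak{B}$-endomorphisms of $\mathfrak{M}$. If $C \in \mathfrak{C}$, then C is a l.t. and,
if C* is the transpose of C with respect to the form $(x, \psi)$, then

$$\sigma(xC, \psi) = \sigma(x, \psi C^*)$$

for all x and $\psi$. Conversely if C is an endomorphism of $\mathfrak{M}$, and if there exists
an endomorphism C** of $\mathfrak{M}^*$ such that $\sigma(xC, \psi) = \sigma(x, \psi C^{**})$ for all x and $\psi$,
then C** is also the transpose of C with respect to the form $(x, \psi)$, and $C \in \mathfrak{C}$.

The endomorphisms $\psi * u$ defined by $x(\psi * u) = \sigma(x, \psi)u$ are elements of $\mathfrak{C}$.
If we introduce the action of $\mathfrak{C}$ upon $\mathfrak{M}^*$ by means of the formula $C\psi = \psi C^*$,
then $\mathfrak{M}^*$ becomes a left $\mathfrak{C}$-module, and $(\psi, u) \to \psi * u$ defines a bilinear pairing
of $\mathfrak{M}^* \times \mathfrak{M} \to \mathfrak{C}$. Finally it is possible to verify, as in §2, that for all $\psi$ and u,

$$\psi * u = \sum_s U_{s^{-1}}(\psi \times u)U_s.$$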

4. Remarks on the structure and representation theory of crossed
products. Let $\mathfrak{B} = \Delta(\mathfrak{G}, H, \rho)$ be a crossed product. We shall prove that
there exists a (1-1) order inverting correspondence between the lattices of
left and right ideals of $\mathfrak{B}$. Let $r(S)$ and $l(S)$ denote the right and left annihilators,
respectively, of an arbitrary subset $S$ of $\mathfrak{B}$. If $\mathfrak{l}$ and $\mathfrak{r}$ are left and right
ideals, respectively, then $r(\mathfrak{l})$ and $l(\mathfrak{r})$ are right and left ideals, respectively.


PROPOSITION 4. If $\mathfrak{B} = \Delta(\mathfrak{G}, H, \rho)$, then the correspondences $\mathfrak{r} \to l(\mathfrak{r})$ and
$\mathfrak{l} \to r(\mathfrak{l})$, where $\mathfrak{r}$ and $\mathfrak{l}$ are right and left ideals, respectively, are inverses of each
other: $r(l(\mathfrak{r})) = \mathfrak{r}$ and $l(r(\mathfrak{l})) = \mathfrak{l}$. Moreover, every indecomposable right or left
ideal direct component of $\mathfrak{B}$ contains a unique minimal non-zero subideal.


Proof. Since $\Delta \subseteq \mathfrak{B}$, $\mathfrak{B}$ is a two-sided vector space over $\Delta$, and the elements
$\{b_1, b_s, \ldots\}$ corresponding to the elements of $\mathfrak{G}$ form both a left and right
basis of $\mathfrak{B}$ over $\Delta$. If $b = \sum b_s \xi_s$ is an arbitrary element of $\mathfrak{B}$, then the mapping

    $b \to \lambda(b) = \xi_1$

is both a left and right $\Delta$-linear function. It is easy to prove that the kernel
of $\lambda$ contains no non-zero left or right ideal of $\mathfrak{B}$ (12, p. 658). Therefore the
associated bilinear form $\lambda$ defined by

(17)    $\lambda(b, b') = \lambda(bb'), \qquad b, b' \in \mathfrak{B},$

is non-degenerate. From these facts it follows that if $\mathfrak{r}$ and $\mathfrak{l}$ are right and left
ideals, respectively, then

    $l(\mathfrak{r}) = \{b \mid b \in \mathfrak{B},\ \lambda(b, \mathfrak{r}) = 0\}, \qquad r(\mathfrak{l}) = \{b \mid b \in \mathfrak{B},\ \lambda(\mathfrak{l}, b) = 0\}.$

Since $\mathfrak{B}$ is finite dimensional over $\Delta$, a well-known property of dual vector
spaces implies the first statement of the theorem.
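
The non-degeneracy of the form (17) can be illustrated by a short computation on the basis elements. The sketch below assumes the multiplication rule $b_s b_t = b_{st}\rho_{s,t}$ (the rule that appears as (27) in §9) with non-zero $\rho_{s,t}$, and writes an element in left coordinates $b = \sum_t \xi_t b_t$; these conventions are assumptions made for illustration, since §2 is not reproduced here.

```latex
% Assumed conventions: b_s b_t = b_{st}\rho_{s,t}, \rho_{s,t} \neq 0,
% and \lambda picks out the coefficient of b_1.
\[
\lambda(b_t b_{s^{-1}}) = \lambda\bigl(b_{ts^{-1}}\,\rho_{t,s^{-1}}\bigr)
  = \begin{cases} \rho_{s,s^{-1}}, & t = s,\\ 0, & t \neq s. \end{cases}
\]
% Hence, for b = \sum_t \xi_t b_t and any fixed s,
\[
\lambda(b, b_{s^{-1}}) = \sum_t \xi_t\,\lambda(b_t b_{s^{-1}}) = \xi_s\,\rho_{s,s^{-1}} .
\]
% If b \neq 0, some \xi_s \neq 0, so \lambda(b, b_{s^{-1}}) \neq 0:
% no non-zero b is orthogonal to all of \mathfrak{B}.
```

The same computation with the roles of the two arguments exchanged gives non-degeneracy on the other side.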

Now let $e\mathfrak{B} \neq 0$ be an indecomposable right ideal, where $e$ is an idempotent.
Since $\mathfrak{B}$ satisfies the minimum condition for left and right ideals, $e\mathfrak{B}$ contains a
unique maximal subideal. Moreover $\mathfrak{B}e$ is an indecomposable left ideal which
also contains a unique maximal subideal (1, Chap. IX). Clearly $l(e\mathfrak{B}) = \mathfrak{B}(1 - e)$.
Suppose that for some $x \in \mathfrak{B}e$, $\lambda(x, e\mathfrak{B}) = 0$. Then, since $xe\mathfrak{B}$ is a right
ideal, we have $xe\mathfrak{B} = 0$, and $x \in \mathfrak{B}(1 - e)$. Therefore $x = 0$, and it follows
that the restriction of $\lambda$ to $\mathfrak{B}e \times e\mathfrak{B}$ is non-degenerate. Because of the order
inverting property of the annihilator correspondence, we conclude that both
$\mathfrak{B}e$ and $e\mathfrak{B}$ possess unique minimal non-zero subideals.

We remark that $\mathfrak{B}$ is a quasi-Frobenius ring (10, p. 8) by Theorem 6 of
(10).

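Now we consider a pairing $(\mathfrak{M}', \mathfrak{M}, \tau)$ of $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{B}$ (see §2), together
with the associated pairing $(\mathfrak{M}', \mathfrak{M}, \odot)$ defined by (13) on $\mathfrak{M}' \times \mathfrak{M}$ to the
centralizer $\mathfrak{E}$ of $\mathfrak{M}$ relative to $\mathfrak{B}$. Let $\mathfrak{c} = \mathfrak{M}' \odot \mathfrak{M}$ be the nucleus of the pairing
$(\mathfrak{M}', \mathfrak{M}, \odot)$; then $\mathfrak{c}$ is a two-sided ideal in $\mathfrak{E}$. We shall prove that the statement
$\mathfrak{c} = \mathfrak{E}$ is equivalent to certain structural properties of $\mathfrak{M}$ viewed as a $\mathfrak{B}$-module.
Later, in §8, we shall show how, when $\mathfrak{E} = \mathfrak{c}$, these properties of $\mathfrak{M}$ can be used
to prove certain ideal theoretic results concerning the ring $\mathfrak{E}$.

The results we require have been established recently by several authors
(4; 5; 9), and it is unnecessary to include the details here. Let us assume that
the (right) dimension of $\mathfrak{M}$ over $\Delta$ is finite; then every l.t. $X$ in $\mathfrak{M}$ over $\Delta$ has
the form $X = \sum \psi_i \times u_i$, for some $\psi_i$ in $\mathfrak{M}'$ and $u_i$ in $\mathfrak{M}$. Our starting point
is the observation ((15), §2) that $\mathfrak{c} = \mathfrak{E}$ if and only if there exists a l.t. $X$ in $\mathfrak{M}$
over $\Delta$ such that

(18)    $\sum_{s \in \mathfrak{G}} R_{s^{-1}} X R_s = 1,$

where 1 is the identity l.t., and $R_s$ is the mapping $x \to xb_s$ in $\mathfrak{M}$.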
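Now we adopt some terminology due to Cartan and Eilenberg. A (right)
$\mathfrak{B}$-module $\mathfrak{M}$ is called projective* (($M_0$) in the sense of Gaschütz (4; cf. also 5 and
7)) if whenever $\mathfrak{T}$ and $\mathfrak{U}$ are $\mathfrak{B}$-modules such that $\mathfrak{U} \subseteq \mathfrak{T}$ and $\mathfrak{T}/\mathfrak{U} \cong \mathfrak{M}$, then
there exists a $\mathfrak{B}$-submodule $\mathfrak{U}^*$ of $\mathfrak{T}$ such that $\mathfrak{T} = \mathfrak{U} \oplus \mathfrak{U}^*$. $\mathfrak{M}$ is called
injective (($M_u$) in (4, 5, 9)) if whenever $\mathfrak{M}$ is $\mathfrak{B}$-isomorphic to a submodule
$\mathfrak{V}$ of $\mathfrak{T}$, then there exists a $\mathfrak{B}$-submodule $\mathfrak{V}^*$ of $\mathfrak{T}$ such that $\mathfrak{T} = \mathfrak{V} \oplus \mathfrak{V}^*$.

*No connection between projective representations and projective modules is implied by
this definition.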


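PROPOSITION 5. Let $(\mathfrak{M}', \mathfrak{M}, \tau)$ be a pairing of $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{B} = \Delta(\mathfrak{G}, H, \rho)$,
and let the (right) dimension of $\mathfrak{M}$ over $\Delta$ be finite. Then the following statements
are equivalent.

(i) $\mathfrak{c}$ contains the identity l.t.;

(ii) $\mathfrak{M}$ is a projective $\mathfrak{B}$-module;

(iii) $\mathfrak{M}$ is an injective $\mathfrak{B}$-module;

(iv) $\mathfrak{M}$ is a direct sum of indecomposable $\mathfrak{B}$-submodules which are $\mathfrak{B}$-isomorphic
to right ideal direct components of $\mathfrak{B}$.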


Proof. Theorem 1 of (9) states that (ii) and (iv) are equivalent (see also
the remark on p. 107 of (9)). The equivalence of (i), (ii), and (iii) has been
proved by Kasch (5, Theorem 12). To verify this statement, the following
remarks may be helpful. We should observe first that $\mathfrak{B}$ is a Frobenius extension
of $\Delta$ with Frobenius homomorphism $b \to \lambda(b)$ (5, p. 462). Then statement
(i) is equivalent to the statement that (18) holds for some l.t. $X$, where we
note that $\{b_1, b_s, b_t, \ldots\}$ and $\{b_1, b_{s^{-1}}, b_{t^{-1}}, \ldots\}$ are orthogonal left and right
bases of $\mathfrak{B}$ over $\Delta$ (5, p. 457) with respect to the bilinear form $\lambda(b, b')$ defined
by (17). We now see that Kasch's theorem is indeed applicable to our situation.


Remark 1. If it is not assumed that the dimension of $\mathfrak{M}$ over $\Delta$ is finite,
then not every l.t. $X$ in $\mathfrak{M}$ over $\Delta$ has the form $\sum \psi_i \times u_i$. The following implications
remain valid: (i) → (18) → [(ii) and (iii)] → (iv).


Remark 2. Assume (i); then from (18) we obtain

    $\sum_s L_{s^{-1}} X^* L_s = 1,$

where 1 is now the identity mapping on $\mathfrak{M}'$, $X^*$ is a l.t. on $\mathfrak{M}'$, and $L_s$ is the
mapping $\psi \to b_s\psi = R_s^*\psi$ in $\mathfrak{M}'$. Therefore we have the implications (i) →
(ii)' → (iv)', where (ii)' and (iv)' are obtained from (ii) and (iv) by replacing
$\mathfrak{M}$ by $\mathfrak{M}'$, and "right" by "left" in (iv).

Remark 3. It follows from the considerations of §3 that a result analogous
to Proposition 5 can be established for the pairing $\sigma$ of $\mathfrak{M} \times \mathfrak{M}^* \to \mathfrak{B}$ which
was constructed in §3. We shall not include the details of this discussion.


5. Abstract theory of regular pairings. Let $\mathfrak{B}$ be an arbitrary ring with
identity element 1, and let $\mathfrak{B}$ admit a set $\Omega$ of (left) operators. We shall assume
that 1 acts as the identity operator on all $\mathfrak{B}$-modules which we shall consider.
Let $\mathfrak{M}'$ and $\mathfrak{M}$ be left and right $\mathfrak{B}$-$\Omega$-modules, which are paired to $\mathfrak{B}$ by a
function $\tau(\psi, x)$. We assume that $\tau$ is bilinear, relative to both $\mathfrak{B}$ and $\Omega$, in the
sense that the equations













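    $\tau(\psi_1 + \psi_2, x) = \tau(\psi_1, x) + \tau(\psi_2, x), \qquad \tau(\psi, x_1 + x_2) = \tau(\psi, x_1) + \tau(\psi, x_2),$
    $\tau(b\psi, x) = b\,\tau(\psi, x), \qquad \tau(\psi, xb) = \tau(\psi, x)\,b,$
    $\tau(\alpha\psi, x) = \alpha\,\tau(\psi, x), \qquad \tau(\psi, \alpha x) = \alpha\,\tau(\psi, x)$


hold for all $x$ in $\mathfrak{M}$, $\psi$ in $\mathfrak{M}'$, $b$ in $\mathfrak{B}$, and $\alpha$ in $\Omega$. Our second assumption is
that $\tau$ is non-degenerate. If these conditions are satisfied, then we shall call the
system $(\mathfrak{M}', \mathfrak{M}, \tau)$ an (abstract) pairing. The nucleus $\mathfrak{b} = \tau(\mathfrak{M}', \mathfrak{M})$ of the
pairing is a two-sided ideal in $\mathfrak{B}$.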

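We let $\mathfrak{E}$ be the set of all $\mathfrak{B}$-$\Omega$-endomorphisms of $\mathfrak{M}$. If $\bar{\alpha}$ denotes the
endomorphism $x \to \alpha x$ of $\mathfrak{M}$ determined by an element $\alpha$ of $\Omega$, then $\bar{\alpha}E \in \mathfrak{E}$ for every
$E$ in $\mathfrak{E}$, and $\bar{\alpha}E = E\bar{\alpha}$, so that if we define $\alpha E = \bar{\alpha}E$, then $\mathfrak{E}$ becomes an
$\Omega$-ring.

The endomorphisms $\psi \odot u$ defined by (13) are elements of $\mathfrak{E}$, and possess
transposes relative to the form $\tau$. Let $\mathfrak{c}$ be the subgroup of $\mathfrak{E}$ consisting of all
finite sums of the $\psi \odot u$. If the action of $\mathfrak{c}$ upon $\mathfrak{M}'$ is defined by the formula
$E\psi = E^*\psi$, for $E$ in $\mathfrak{c}$, then it follows that $(\psi, u) \to \psi \odot u$ is a $\mathfrak{c}$-$\Omega$-bilinear
mapping of $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{c}$. We shall denote this pairing by $(\mathfrak{M}', \mathfrak{M}, \odot)$, and
observe that the nucleus $\mathfrak{c}$ is an $\Omega$-subring of $\mathfrak{E}$. If $E \in \mathfrak{E}$, then $(\psi \odot u)E =
\psi \odot uE$, and hence $\mathfrak{c}$ is a right $\Omega$-ideal in $\mathfrak{E}$. We shall denote by $\mathfrak{C}$ an arbitrary
$\Omega$-subring of $\mathfrak{E}$ such that

(19)    $\mathfrak{c} \subseteq \mathfrak{C} \subseteq \mathfrak{E}.$

Then $\mathfrak{C}$ will be called a centralizer of $\mathfrak{M}$ relative to $\mathfrak{B}$, and will remain fixed
throughout the discussion. Our aim is to establish relationships between the
nuclei $\mathfrak{b}$ and $\mathfrak{c}$ of the rings $\mathfrak{B}$ and $\mathfrak{C}$, and the properties of $\mathfrak{M}$ and $\mathfrak{M}'$ as $\mathfrak{B}$- and
$\mathfrak{C}$-modules.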
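
The two facts about $\psi \odot u$ used above can be checked directly. Formula (13) is not reproduced in this part of the paper; the sketch below assumes it has the form $x(\psi \odot u) = u\,\tau(\psi, x)$, which is the form used in the proof of Theorem 1 below.

```latex
% Assumption: (13) reads  x(\psi \odot u) = u\,\tau(\psi, x).
% (a) \psi \odot u commutes with right multiplication by b \in \mathfrak{B}:
\[
(xb)(\psi \odot u) = u\,\tau(\psi, xb) = u\,\tau(\psi, x)\,b
  = \bigl(x(\psi \odot u)\bigr)\,b .
\]
% (b) For E \in \mathfrak{E}, using that E is a \mathfrak{B}-endomorphism:
\[
x(\psi \odot u)E = \bigl(u\,\tau(\psi, x)\bigr)E = (uE)\,\tau(\psi, x)
  = x(\psi \odot uE),
\]
% so (\psi \odot u)E = \psi \odot uE, and \mathfrak{c}\mathfrak{E} \subseteq \mathfrak{c}.
```

Step (a) shows that each $\psi \odot u$ is a $\mathfrak{B}$-endomorphism, and step (b) is the identity behind the statement that $\mathfrak{c}$ is a right ideal.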

In order to discuss the connection between the ring $\mathfrak{B}$ and the structure of
$\mathfrak{M}$ (or $\mathfrak{M}'$) as a $\mathfrak{C}$-module, we shall assume that the pairing $\tau$ is regular in the sense
that $\mathfrak{b}$ contains an element $e_0 = \sum \tau(\psi^*_i, x^*_i)$ such that $be_0 = e_0b = b$ for all
$b \in \mathfrak{b}$. By the non-degeneracy of $\tau$ it follows that $xe_0 = x$ and $e_0\psi = \psi$ for all
$x \in \mathfrak{M}$ and $\psi \in \mathfrak{M}'$.

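It is always possible to construct a regular pairing from an arbitrary one.
Let $e_0$ be any central idempotent contained in the nucleus $\mathfrak{b}$ of a pairing
$(\mathfrak{M}', \mathfrak{M}, \tau)$, or let $e_0 = 0$ if $\mathfrak{b}$ contains no central idempotent. Then

    $\mathfrak{M} = \mathfrak{M}e_0 \oplus \mathfrak{M}(1 - e_0), \qquad \mathfrak{M}' = e_0\mathfrak{M}' \oplus (1 - e_0)\mathfrak{M}',$

where the direct summands are invariant relative to both $\mathfrak{B}$ and $\mathfrak{c}$. We
define a new pairing $\tau_0$ of $e_0\mathfrak{M}' \times \mathfrak{M}e_0 \to \mathfrak{B}$ by setting

    $\tau_0(e_0\psi, xe_0) = \tau(e_0\psi, xe_0)$

for all $\psi$ and $x$ and we shall prove that $\tau_0$ is a regular pairing. The nucleus $\mathfrak{b}_0$
of $\tau_0$ contains $e_0$, for if $e_0 = \sum \tau(\psi_i, x_i)$, then $e_0 = \sum \tau(e_0\psi_i, x_ie_0)$. Obviously

    $e_0b = be_0 = b, \qquad b \in \mathfrak{b}_0.$

The bilinearity of $\tau_0$ is evident. It remains to prove that $\tau_0$ is non-degenerate.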











Suppose $\tau_0(e_0\psi, xe_0) = 0$ for all $e_0\psi \in e_0\mathfrak{M}'$. If $\psi$ is arbitrary in $\mathfrak{M}'$, we write

    $\psi = e_0\psi + (1 - e_0)\psi$

and obtain

    $\tau(\psi, xe_0) = \tau_0(e_0\psi, xe_0) + \tau((1 - e_0)\psi, xe_0)$
    $\qquad = \tau((1 - e_0)\psi, xe_0)\,e_0 = e_0\,\tau((1 - e_0)\psi, xe_0) = 0,$

so that $xe_0 = 0$ by the non-degeneracy of $\tau$. Similarly $\tau_0(e_0\psi, \mathfrak{M}e_0) = 0$ implies
$e_0\psi = 0$.

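We return now to our assumption that the pairing is regular. If $S$ is any
subset of $\mathfrak{B}$, we shall write $S_r$ (resp. $S_l$) for the set of endomorphisms $R_s$:
$x \to xs$ (resp. $L_s$: $\psi \to s\psi$) of $\mathfrak{M}$ (resp. $\mathfrak{M}'$) determined by the elements $s$ of $S$.
We are in a position to prove the following result:


THEOREM 1. Let $\mathcal{B}$ be the set of all $\mathfrak{C}$-endomorphisms of $\mathfrak{M}$. Then $\mathfrak{b}_r = \mathfrak{B}_r = \mathcal{B}$.


Proof. Obviously $\mathfrak{b}_r \subseteq \mathfrak{B}_r \subseteq \mathcal{B}$. Conversely let $B \in \mathcal{B}$; then $B(\psi \odot u) =
(\psi \odot u)B$ for all $\psi$ and $u$. Consequently

    $u\,\tau(\psi, xB) = (u\,\tau(\psi, x))B$

for all $x$, $\psi$, and $u$. Let $b = \sum \tau(\psi^*_i, x^*_iB)$; then for all $u \in \mathfrak{M}$ we have

    $ub = \sum u\,\tau(\psi^*_i, x^*_iB) = (u \sum \tau(\psi^*_i, x^*_i))B = (ue_0)B = uB,$

and $R_b = B$. This completes the proof.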

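If $\mathfrak{R}$ is a $\mathfrak{C}$-$\Omega$-submodule of $\mathfrak{M}$, then

    $\tau(\mathfrak{M}', \mathfrak{R}) = \{\sum \tau(\psi_i, x_i) \mid \psi_i \in \mathfrak{M}',\ x_i \in \mathfrak{R}\}$

is a left $\Omega$-ideal contained in $\mathfrak{b}$. If $\mathfrak{l}$ is a left $\Omega$-ideal in $\mathfrak{B}$, then $\mathfrak{M}\mathfrak{l}$ is a $\mathfrak{C}$-$\Omega$-submodule
of $\mathfrak{M}$. We have, for all $\mathfrak{R}$ and $\mathfrak{l}$,

(20)    $\mathfrak{M}\,\tau(\mathfrak{M}', \mathfrak{R}) \subseteq \mathfrak{R}; \qquad \tau(\mathfrak{M}', \mathfrak{M}\mathfrak{l}) \subseteq \mathfrak{l}:$

the first since $\mathfrak{M}\,\tau(\mathfrak{M}', \mathfrak{R}) \subseteq \mathfrak{R}(\mathfrak{M}' \odot \mathfrak{M}) \subseteq \mathfrak{R}\mathfrak{C} \subseteq \mathfrak{R}$ by (19) and the
fact that $\mathfrak{R}$ is a $\mathfrak{C}$-submodule* of $\mathfrak{M}$; the second, obvious. For later use we
observe also that

(21)    $\tau(\mathfrak{M}', \sum \mathfrak{R}_i) = \sum \tau(\mathfrak{M}', \mathfrak{R}_i), \qquad \mathfrak{M}(\sum \mathfrak{l}_i) = \sum \mathfrak{M}\mathfrak{l}_i,$

and

(22)    $\tau(\mathfrak{M}', \mathfrak{R}\mathfrak{l}) = \tau(\mathfrak{M}', \mathfrak{R})\,\mathfrak{l}, \qquad \mathfrak{M}(\mathfrak{l}_1\mathfrak{l}_2) = (\mathfrak{M}\mathfrak{l}_1)\mathfrak{l}_2.$


LEMMA 1. Let $\mathfrak{R}$ be a $\mathfrak{C}$-direct summand of $\mathfrak{M}$. Then there exists an idempotent
$e \in \mathfrak{B}$ such that $\tau(\mathfrak{M}', \mathfrak{R}) = \mathfrak{B}e$.


*For the rest of §§5, 6, and 7, we shall omit explicit reference to the set $\Omega$. Thus by submodule,
ideal, etc. we shall mean $\Omega$-submodule, $\Omega$-ideal, etc.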




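Proof. Let $E$ be a projection of $\mathfrak{M}$ upon $\mathfrak{R}$ such that $E \in \mathcal{B}$. By Theorem 1,
$E = R_e$, where

    $e = \sum \tau(\psi^*_i, x^*_iE) \in \tau(\mathfrak{M}', \mathfrak{R}).$

If $b = \sum \tau(\psi_i, x_i)$ is an arbitrary element of $\tau(\mathfrak{M}', \mathfrak{R})$, then $be = b$ since the
restriction of $E$ to $\mathfrak{R}$ is the identity mapping. Therefore $\tau(\mathfrak{M}', \mathfrak{R}) = \mathfrak{B}e$.


LEMMA 2. Let $\mathfrak{R}$ be a $\mathfrak{C}$-submodule such that $\tau(\mathfrak{M}', \mathfrak{R}) = \mathfrak{B}e$, where $e$ is an
idempotent in $\mathfrak{B}$. Then $\mathfrak{M}\,\tau(\mathfrak{M}', \mathfrak{R}) = \mathfrak{R}$.


Proof. By the non-degeneracy of $\tau$ we have $x = xe \in \mathfrak{M}\,\tau(\mathfrak{M}', \mathfrak{R})$ for all
$x \in \mathfrak{R}$, and together with (20), this proves the Lemma.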
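
The appeal to non-degeneracy in the proof of Lemma 2 is brief; the steps can be written out as follows. This is only an expanded reading of the argument, using the hypothesis $\tau(\mathfrak{M}', \mathfrak{R}) = \mathfrak{B}e$ with $e^2 = e$ and nothing beyond the bilinearity of $\tau$ and (20).

```latex
% For x \in \mathfrak{R} and \psi \in \mathfrak{M}', \tau(\psi, x) \in \mathfrak{B}e,
% hence \tau(\psi, x)e = \tau(\psi, x), and
\[
\tau(\psi, xe - x) = \tau(\psi, x)\,e - \tau(\psi, x) = 0
\qquad\text{for all } \psi \in \mathfrak{M}' .
\]
% Non-degeneracy of \tau gives xe = x.  Since e \in \mathfrak{B}e = \tau(\mathfrak{M}', \mathfrak{R}),
\[
x = xe \in \mathfrak{M}\,\tau(\mathfrak{M}', \mathfrak{R}),
\qquad\text{so}\qquad
\mathfrak{R} \subseteq \mathfrak{M}\,\tau(\mathfrak{M}', \mathfrak{R}) \subseteq \mathfrak{R},
\]
% where the second inclusion is the first part of (20).
```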


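LEMMA 3. Let $\mathfrak{l} = \mathfrak{B}e$, where $e^2 = e \in \mathfrak{b}$. Then $\mathfrak{M}\mathfrak{l} = \mathfrak{M}e$ is a $\mathfrak{C}$-direct
summand of $\mathfrak{M}$, and $\tau(\mathfrak{M}', \mathfrak{M}\mathfrak{l}) = \mathfrak{l}$.


Proof. We have $\mathfrak{M}\mathfrak{l} = \mathfrak{M}\mathfrak{B}e = \mathfrak{M}e$, and $\mathfrak{M} = \mathfrak{M}e \oplus \mathfrak{M}(1 - e)$, proving
the first statement. For the second, $b \in \mathfrak{l}$, $b = \sum \tau(\psi_i, x_i)$, implies

    $b = be = \sum \tau(\psi_i, x_ie) \in \tau(\mathfrak{M}', \mathfrak{M}\mathfrak{l}),$

and by (20) we infer that $\mathfrak{l} = \tau(\mathfrak{M}', \mathfrak{M}\mathfrak{l})$.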


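THEOREM 2. Let $(\mathfrak{M}', \mathfrak{M}, \tau)$ be a regular pairing with nucleus $\mathfrak{b}$. The mappings
$\mathfrak{l} \to \mathfrak{M}\mathfrak{l}$ and $\mathfrak{R} \to \tau(\mathfrak{M}', \mathfrak{R})$ between the set of left ideal direct components of $\mathfrak{b}$
and the $\mathfrak{C}$-direct summands of $\mathfrak{M}$ are inverses of each other. The mapping $\mathfrak{l} \to \mathfrak{M}\mathfrak{l}$
preserves sums of arbitrary ideals, and intersections of left ideal direct components
of $\mathfrak{b}$. Two left ideal direct components $\mathfrak{l}_1$ and $\mathfrak{l}_2$ of $\mathfrak{b}$ are $\mathfrak{B}$-isomorphic if and only
if $\mathfrak{M}\mathfrak{l}_1$ and $\mathfrak{M}\mathfrak{l}_2$ are $\mathfrak{C}$-isomorphic.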


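Proof. The first statement follows from Lemmas 1-3. By (21) the mapping
$\mathfrak{l} \to \mathfrak{M}\mathfrak{l}$ preserves sums. The statement concerning intersections is an immediate
consequence of the fact to be proved next, that if $\mathfrak{l}$ is a left ideal direct
component of $\mathfrak{b}$ then

    $\mathfrak{M}\mathfrak{l} = \{x \mid \tau(\mathfrak{M}', x) \subseteq \mathfrak{l}\}.$

Let $\mathfrak{l} = \mathfrak{B}e$, where $e^2 = e \in \mathfrak{b}$. Then $\tau(\mathfrak{M}', x) \subseteq \mathfrak{l}$ implies $\tau(\psi, xe) = \tau(\psi, x)$
for all $\psi \in \mathfrak{M}'$, and by the non-degeneracy of $\tau$, $x = xe \in \mathfrak{M}\mathfrak{l}$. Conversely
$x \in \mathfrak{M}\mathfrak{l}$ implies $xe = x$, and

    $\tau(\mathfrak{M}', x) = \tau(\mathfrak{M}', x)\,e \subseteq \mathfrak{l}.$

Let $\mathfrak{B}e_1$ and $\mathfrak{B}e_2$ be $\mathfrak{B}$-isomorphic; then there exist elements $a$ and $b$ such
that

    $\mathfrak{B}e_1a = \mathfrak{B}e_2, \qquad \mathfrak{B}e_2b = \mathfrak{B}e_1, \qquad cab = c, \quad c \in \mathfrak{B}e_1,$

$dba = d$ for all $d \in \mathfrak{B}e_2$. One verifies easily that $xe_1 \to xe_1a$ and $xe_2 \to xe_2b$
are $\mathfrak{C}$-homomorphisms between $\mathfrak{M}e_1$ and $\mathfrak{M}e_2$ which are inverses of each other,
and consequently $\mathfrak{M}e_1$ and $\mathfrak{M}e_2$ are $\mathfrak{C}$-isomorphic.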

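Conversely let $x \to x^h$ be a $\mathfrak{C}$-isomorphism of $\mathfrak{R}_1$ onto $\mathfrak{R}_2$. Define

    $h: \sum \tau(\psi_i, x_i) \to \sum \tau(\psi_i, x_i^h)$

of $\tau(\mathfrak{M}', \mathfrak{R}_1)$ into $\tau(\mathfrak{M}', \mathfrak{R}_2)$. In order to prove that $\tau(\mathfrak{M}', \mathfrak{R}_1)$ and $\tau(\mathfrak{M}', \mathfrak{R}_2)$
are $\mathfrak{B}$-isomorphic, it is clearly sufficient to prove that $h$ is a $\mathfrak{B}$-homomorphism
onto. If

    $\sum \tau(\psi_i, x_i) = 0, \qquad x_i \in \mathfrak{R}_1,$

then $0 = \mathfrak{M}(\sum \tau(\psi_i, x_i)) = \sum x_i(\psi_i \odot \mathfrak{M})$, and since $h$ is a $\mathfrak{C}$-isomorphism,

    $\sum x_i^h(\psi_i \odot \mathfrak{M}) = \mathfrak{M} \sum \tau(\psi_i, x_i^h) = 0.$

Since $e_0 = \sum \tau(\psi^*_j, x^*_j)$ is a left identity element in $\mathfrak{b}$, we have

    $\sum \tau(\psi_i, x_i^h) = \sum e_0\,\tau(\psi_i, x_i^h) = 0.$

Thus $h$ is single valued. The fact that it is onto, and is a $\mathfrak{B}$-homomorphism
can be checked in a similar way using the properties of $\tau$. This completes
the proof.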
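
The last displayed step uses the following general fact about a regular pairing: an element of the nucleus that annihilates $\mathfrak{M}$ is zero. A short derivation, relying only on the bilinearity of $\tau$ and the definition of $e_0$ given in this section:

```latex
% Let b \in \mathfrak{b} with \mathfrak{M}b = 0, and e_0 = \sum_j \tau(\psi^*_j, x^*_j).
\[
b = e_0 b = \sum_j \tau(\psi^*_j, x^*_j)\,b
  = \sum_j \tau(\psi^*_j, x^*_j b) = 0 ,
\]
% because each x^*_j b lies in \mathfrak{M}b = 0.
% In the proof above this is applied to b = \sum_i \tau(\psi_i, x_i^h).
```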


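COROLLARY. A left ideal direct component $\mathfrak{l}$ of $\mathfrak{b}$ is indecomposable if and only
if $\mathfrak{M}\mathfrak{l}$ is an indecomposable direct summand of $\mathfrak{M}$.


Proof. Let $\mathfrak{l}$ be a decomposable direct component of $\mathfrak{b}$: $\mathfrak{l} = \mathfrak{l}_1 \oplus \mathfrak{l}_2$. Then
by the theorem $\mathfrak{M}\mathfrak{l} = \mathfrak{M}\mathfrak{l}_1 \oplus \mathfrak{M}\mathfrak{l}_2$, where neither component is zero. The
converse is proved similarly.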

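Let us denote by $\mathfrak{c}^*$ the set of transposes relative to $\tau$ of the elements of $\mathfrak{c}$,
and write $\mathfrak{B}_l$ and $\mathfrak{b}_l$, respectively, for the sets of endomorphisms $\psi \to b\psi$
determined in $\mathfrak{M}'$ by the elements of $\mathfrak{B}$ and $\mathfrak{b}$. Let $\mathfrak{E}^*$ be the set of all $\mathfrak{B}$-$\Omega$-endomorphisms
of $\mathfrak{M}'$, and let $\mathfrak{C}^*$ be an arbitrary ring of $\Omega$-endomorphisms
of $\mathfrak{M}'$ such that $\mathfrak{c}^* \subseteq \mathfrak{C}^* \subseteq \mathfrak{E}^*$. We shall write $\mathcal{B}'$ for the set of $\mathfrak{C}^*$-endomorphisms
of $\mathfrak{M}'$. Then we may state the following duals to Theorems 1
and 2.


THEOREM 1'. Let $(\mathfrak{M}', \mathfrak{M}, \tau)$ be a regular pairing with nucleus $\mathfrak{b}$, and let
$\mathfrak{C}^*$ be an $\Omega$-subring of $\mathfrak{E}^*$ containing $\mathfrak{c}^*$. Then $\mathfrak{b}_l = \mathfrak{B}_l = \mathcal{B}'$.


THEOREM 2'. Let $(\mathfrak{M}', \mathfrak{M}, \tau)$ be a regular pairing with nucleus $\mathfrak{b}$. The mappings
$\mathfrak{r} \to \mathfrak{r}\mathfrak{M}'$, $\mathfrak{R}' \to \tau(\mathfrak{R}', \mathfrak{M})$ between the sets of right ideal direct components
of $\mathfrak{b}$ and the $\mathfrak{C}^*$-direct summands of $\mathfrak{M}'$ are inverses of each other, and possess the
properties stated in Theorem 2.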


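THEOREM 3 (Weyl).* Let $(\mathfrak{M}', \mathfrak{M}, \tau)$ be a pairing of $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{B}$, where $\mathfrak{B}$
is a semi-simple $\Omega$-ring satisfying the minimum condition for left ideals. Then
the pairing is regular. The mappings $\mathfrak{l} \to \mathfrak{M}\mathfrak{l}$ and $\mathfrak{R} \to \tau(\mathfrak{M}', \mathfrak{R})$ are inverses
of each other, and establish a (1-1) inclusion preserving correspondence between
the set of all left ideals of $\mathfrak{B}$ which are contained in the nucleus $\mathfrak{b}$, and the set of all
$\mathfrak{C}$-submodules of $\mathfrak{M}$. If

    $\mathfrak{l}_1 \leftrightarrow \mathfrak{R}_1 = \mathfrak{M}\mathfrak{l}_1, \qquad \mathfrak{l}_2 \leftrightarrow \mathfrak{R}_2 = \mathfrak{M}\mathfrak{l}_2,$

then

    $\mathfrak{l}_1 + \mathfrak{l}_2 \leftrightarrow \mathfrak{R}_1 + \mathfrak{R}_2, \qquad \mathfrak{l}_1 \cap \mathfrak{l}_2 \leftrightarrow \mathfrak{R}_1 \cap \mathfrak{R}_2,$

and $\mathfrak{l}_1$ and $\mathfrak{l}_2$ are $\mathfrak{B}$-isomorphic if and only if $\mathfrak{R}_1$ and $\mathfrak{R}_2$ are $\mathfrak{C}$-isomorphic.

*This result, and Theorem 2 in its essentials, have been proved by Weyl for pairings of the
type considered in §3 (16, Chap. 5; 15; 17, Chap. 3).


Proof. The structure theory of semi-simple rings implies that the pairing
is regular, and that every left ideal in $\mathfrak{b}$ is a direct component of $\mathfrak{b}$. By Theorem
2, $\mathfrak{l} \to \mathfrak{M}\mathfrak{l}$ is a (1-1) inclusion preserving correspondence between the set of
all left ideal direct components of $\mathfrak{b}$ and the set of all $\mathfrak{C}$-submodules of $\mathfrak{M}$.
By a principle of lattice theory, the mapping preserves the lattice operations.
That it preserves isomorphism relations has been proved in Theorem 2.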


Example. Let $b \to U(b)$ be an ordinary representation of the group algebra
$\mathfrak{B}$ of a finite group $\mathfrak{G}$ by l.t. in a finite dimensional vector space $\mathfrak{M}$ over a
field, and let $\mathfrak{C}$ be the set of all l.t. commuting with the l.t. $U(b)$, $b \in \mathfrak{B}$.
Let $\mathfrak{b}_\tau$ and $\mathfrak{b}_\sigma$ be the nuclei of the pairings constructed in §2 and §3 respectively.
Finally let us assume that both pairings are regular. Then by Theorems 2 and
2', a left ideal $\mathfrak{B}e$ of $\mathfrak{b}_\tau$ is matched against the $\mathfrak{C}$-submodule $\mathfrak{M}U(e)$ of $\mathfrak{M}$,
while a right ideal $f\mathfrak{B}$ of $\mathfrak{b}_\sigma$ generated by an idempotent $f$ is matched against
the $\mathfrak{C}$-submodule $\mathfrak{M}U(f')$. We remark finally that $\mathfrak{b}_\tau = \mathfrak{b}_\sigma'$.


6. Maximal submodules of indecomposable $\mathfrak{C}$-direct summands.
We adhere to the assumptions and notation of §5, and make the additional
assumption that $\mathfrak{B}$ satisfies the minimum condition for left ideals, and hence
also the maximum condition, since $\mathfrak{B}$ has an identity element. Let $\mathfrak{N}$ be the
radical of $\mathfrak{B}$; then every indecomposable left ideal direct component $\mathfrak{B}e$ of $\mathfrak{B}$
has a unique maximal subideal $\mathfrak{N}e$. Every proper subideal of $\mathfrak{B}e$ is nilpotent,
and $\mathfrak{B}e$ and $\mathfrak{B}e'$ are $\mathfrak{B}$-isomorphic if and only if $\mathfrak{B}e/\mathfrak{N}e$ and $\mathfrak{B}e'/\mathfrak{N}e'$ are
$\mathfrak{B}$-isomorphic (1, Chap. IX).


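LEMMA 4. Let $\mathfrak{R} = \mathfrak{M}e$ be an indecomposable $\mathfrak{C}$-direct summand of $\mathfrak{M}$.
Then $\mathfrak{R}$ has a unique maximal $\mathfrak{C}$-submodule $\mathfrak{S}$, and

(23)    $\tau(\mathfrak{M}', \mathfrak{S}) \subseteq \mathfrak{N}e, \qquad \mathfrak{M}(\mathfrak{N}e) \subseteq \mathfrak{S}.$


Proof. By the Corollary to Theorem 2, $\mathfrak{B}e = \tau(\mathfrak{M}', \mathfrak{R})$ is an indecomposable
left ideal. Let $\mathfrak{S} = \sum \mathfrak{R}_\alpha$, where $\{\mathfrak{R}_\alpha\}$ is the set of all proper $\mathfrak{C}$-submodules of
$\mathfrak{R}$. By (21) and the fact that $\mathfrak{B}$ satisfies the maximum condition for left ideals,
we have

    $\tau(\mathfrak{M}', \mathfrak{S}) = \sum \tau(\mathfrak{M}', \mathfrak{R}_\alpha),$

which in turn can be expressed as a finite sum

    $\sum_i \tau(\mathfrak{M}', \mathfrak{R}_{\alpha_i}).$

No $\tau(\mathfrak{M}', \mathfrak{R}_\alpha) = \mathfrak{B}e$, otherwise, by Lemma 2,

    $\mathfrak{R}_\alpha = \mathfrak{M}\,\tau(\mathfrak{M}', \mathfrak{R}_\alpha) = \mathfrak{M}e = \mathfrak{R}.$

Hence each $\tau(\mathfrak{M}', \mathfrak{R}_\alpha)$ is nilpotent, and since the sum is finite, $\tau(\mathfrak{M}', \mathfrak{S})$ is
nilpotent. This proves (i) $\mathfrak{S} \neq \mathfrak{R}$ (for if $\mathfrak{S} = \mathfrak{R}$ then $\tau(\mathfrak{M}', \mathfrak{S})$ contains an
idempotent $\neq 0$) and (ii) $\tau(\mathfrak{M}', \mathfrak{S}) \subseteq \mathfrak{N}e$. For the other inclusion of (23) it is
sufficient to prove that $\mathfrak{M}(\mathfrak{N}e) \neq \mathfrak{R}$. If, however, $\mathfrak{M}(\mathfrak{N}e) = \mathfrak{R}$, then by
Lemma 3 and (22) we have

    $\mathfrak{B}e = \tau(\mathfrak{M}', \mathfrak{R}) = \tau(\mathfrak{M}', \mathfrak{M})\,\mathfrak{N}e \subseteq \mathfrak{N},$

contrary to our assumption that $e^2 = e \neq 0$. This completes the proof.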




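THEOREM 4. Let $\mathfrak{R}_1$ and $\mathfrak{R}_2$ be indecomposable $\mathfrak{C}$-direct summands of $\mathfrak{M}$
with maximal $\mathfrak{C}$-submodules $\mathfrak{S}_1$ and $\mathfrak{S}_2$. Then $\mathfrak{R}_1/\mathfrak{S}_1$ and $\mathfrak{R}_2/\mathfrak{S}_2$ are $\mathfrak{C}$-isomorphic
if and only if $\mathfrak{R}_1$ and $\mathfrak{R}_2$ are $\mathfrak{C}$-isomorphic.


Proof. We prove the result by throwing the argument back to the known
results concerning the ideals in $\mathfrak{B}$. Using Lemma 4, it is easy to prove that a
$\mathfrak{C}$-isomorphism of $\mathfrak{R}_1$ onto $\mathfrak{R}_2$ induces a $\mathfrak{C}$-isomorphism of $\mathfrak{R}_1/\mathfrak{S}_1$ onto $\mathfrak{R}_2/\mathfrak{S}_2$.
For the proof of the converse it is enough to show, by Theorem 2, that
$\mathfrak{B}e_1 = \tau(\mathfrak{M}', \mathfrak{R}_1)$ and $\mathfrak{B}e_2 = \tau(\mathfrak{M}', \mathfrak{R}_2)$ are $\mathfrak{B}$-isomorphic. This we prove by
showing that $\mathfrak{B}e_1/\mathfrak{N}e_1$ and $\mathfrak{B}e_2/\mathfrak{N}e_2$ are $\mathfrak{B}$-isomorphic, assuming that $\mathfrak{R}_1/\mathfrak{S}_1$
and $\mathfrak{R}_2/\mathfrak{S}_2$ are $\mathfrak{C}$-isomorphic.

Let $\zeta$ be a $\mathfrak{C}$-isomorphism of $\mathfrak{R}_1/\mathfrak{S}_1$ onto $\mathfrak{R}_2/\mathfrak{S}_2$, and let $\theta = \zeta^{-1}$. In both
$\mathfrak{R}_1$ and $\mathfrak{R}_2$ choose a fixed system of representatives of the cosets in $\mathfrak{R}_1/\mathfrak{S}_1$ and
$\mathfrak{R}_2/\mathfrak{S}_2$ respectively, and for each $x_1 \in \mathfrak{R}_1$, let $x_1\zeta$ be the representative of the
coset $(x_1 + \mathfrak{S}_1)\zeta$; that is

    $x_1\zeta + \mathfrak{S}_2 = (x_1 + \mathfrak{S}_1)\zeta.$

Similarly we define a map $\theta$ of $\mathfrak{R}_2$ into $\mathfrak{R}_1$. We have

(24)    $x_1 \equiv x_1\zeta\theta \pmod{\mathfrak{S}_1}, \qquad x_2 \equiv x_2\theta\zeta \pmod{\mathfrak{S}_2},$

for all $x_1 \in \mathfrak{R}_1$, $x_2 \in \mathfrak{R}_2$.

Now define a mapping $\mu$ of $\tau(\mathfrak{M}', \mathfrak{R}_1)$ into $\tau(\mathfrak{M}', \mathfrak{R}_2)$, namely

    $\mu: \sum \tau(\psi_i, x_{1i}) \to \sum \tau(\psi_i, x_{1i}\zeta),$

where $x_{1i} \in \mathfrak{R}_1$, $\psi_i \in \mathfrak{M}'$ for all $i$. We contend that the induced mapping

    $\bar{\mu}: \sum \tau(\psi_i, x_{1i}) + \mathfrak{N}e_1 \to \sum \tau(\psi_i, x_{1i}\zeta) + \mathfrak{N}e_2$

is a $\mathfrak{B}$-isomorphism of $\mathfrak{B}e_1/\mathfrak{N}e_1$ onto $\mathfrak{B}e_2/\mathfrak{N}e_2$.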
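First we prove that $\bar{\mu}$ is single valued. Let $a = \sum \tau(\psi_i, x_{1i}) \in \mathfrak{N}e_1$; then
$\mathfrak{M}a \subseteq \mathfrak{M}(\mathfrak{N}e_1) \subseteq \mathfrak{S}_1$ by (23). Thus for all $u \in \mathfrak{M}$,

    $ua + \mathfrak{S}_1 = \sum x_{1i}(\psi_i \odot u) + \mathfrak{S}_1 = 0.$

Applying $\zeta$ we have $\sum (x_{1i} + \mathfrak{S}_1)\zeta\,(\psi_i \odot u) = 0$. Then

    $\sum x_{1i}\zeta\,(\psi_i \odot u) \in \mathfrak{S}_2, \qquad \mathfrak{M} \sum \tau(\psi_i, x_{1i}\zeta) \subseteq \mathfrak{S}_2.$

If $e_0 = \sum \tau(\psi^*_j, x^*_j)$ is the identity element in $\mathfrak{b}$, then $a = e_0a$ implies

    $a\mu = e_0(a\mu) \in \tau(\mathfrak{M}', \mathfrak{S}_2) \subseteq \mathfrak{N}e_2$

by (23), and $\bar{\mu}$ is single valued. That $\bar{\mu}$ is a $\mathfrak{B}$-homomorphism follows from the
bilinearity of $\tau$, and the ontoness from (24) and (23). To prove that $\bar{\mu}$ is (1-1),
let

    $\sum \tau(\psi_i, x_{1i}\zeta) \in \mathfrak{N}e_2, \qquad \psi_i \in \mathfrak{M}', \quad x_{1i} \in \mathfrak{R}_1.$

Then by (23), $\mathfrak{M} \sum \tau(\psi_i, x_{1i}\zeta) \subseteq \mathfrak{S}_2$; as in the first part of the proof we now
verify that $\sum \tau(\psi_i, x_{1i}\zeta\theta) \in \mathfrak{N}e_1$, and that

    $\sum \tau(\psi_i, x_{1i}) - \sum \tau(\psi_i, x_{1i}\zeta\theta) = \sum \tau(\psi_i, x_{1i} - x_{1i}\zeta\theta) \in \tau(\mathfrak{M}', \mathfrak{S}_1) \subseteq \mathfrak{N}e_1$

by (23). Thus $\sum \tau(\psi_i, x_{1i}) \in \mathfrak{N}e_1$, and we have proved that $\bar{\mu}$ is (1-1). This
completes the proof of the theorem.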


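7. The structure of $\mathfrak{c} = \mathfrak{M}' \odot \mathfrak{M}$. Let $(\mathfrak{M}', \mathfrak{M}, \tau)$ be an abstract pairing.
We shall assume that the nucleus $\mathfrak{c} = \mathfrak{M}' \odot \mathfrak{M}$ of the associated pairing
$(\mathfrak{M}', \mathfrak{M}, \odot)$ contains the identity mapping on $\mathfrak{M}$, and that the function $\psi \odot u$
is non-degenerate. Since $\mathfrak{c}$ is a right ideal in $\mathfrak{E}$, the first assumption implies
that $\mathfrak{c} = \mathfrak{E}$, and the two assumptions combined imply that the dual pairing
$(\mathfrak{M}', \mathfrak{M}, \odot)$ is regular in the sense of §5. Since the ring $\mathcal{B}$ of all $\mathfrak{c}$-endomorphisms
of $\mathfrak{M}$ is a centralizer of $\mathfrak{M}$ relative to $\mathfrak{c}$, the methods of §5 yield a
correspondence between the $\mathcal{B}$-direct summands of $\mathfrak{M}$ and the left ideal direct
components of $\mathfrak{c}$: to a $\mathcal{B}$-direct summand $\mathfrak{R}$ corresponds

    $\mathfrak{M}' \odot \mathfrak{R} = \{\sum \psi_i \odot u_i \mid \psi_i \in \mathfrak{M}',\ u_i \in \mathfrak{R}\},$

while to a left ideal direct component $\mathfrak{l}$ of $\mathfrak{c}$ corresponds the $\mathcal{B}$-direct summand
$\mathfrak{M}\mathfrak{l}$.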


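THEOREM 5. Let $(\mathfrak{M}', \mathfrak{M}, \odot)$ be a regular pairing of $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{c}$, which is
dual to an abstract pairing $(\mathfrak{M}', \mathfrak{M}, \tau)$. Then the mappings $\mathfrak{R} \to \mathfrak{M}' \odot \mathfrak{R}$ and
$\mathfrak{l} \to \mathfrak{M}\mathfrak{l}$ between the set of $\mathfrak{B}$-direct summands of $\mathfrak{M}$ and the set of left ideal direct
components of $\mathfrak{c}$ are inverses of each other. These mappings preserve direct sums
and intersections whenever all modules concerned are direct summands. Two
$\mathfrak{B}$-direct summands $\mathfrak{R}_1$ and $\mathfrak{R}_2$ are $\mathfrak{B}$-isomorphic if and only if $\mathfrak{M}' \odot \mathfrak{R}_1$ and
$\mathfrak{M}' \odot \mathfrak{R}_2$ are $\mathfrak{c}$-isomorphic. $\mathfrak{R}$ is an indecomposable $\mathfrak{B}$-direct summand of $\mathfrak{M}$ if
and only if $\mathfrak{M}' \odot \mathfrak{R}$ is an indecomposable left ideal in $\mathfrak{c}$.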


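Proof. The first part of the theorem follows from Theorem 2, if we observe
that a $\mathfrak{B}$-direct summand of $\mathfrak{M}$ is necessarily a $\mathcal{B}$-direct summand. By Theorem
2, a $\mathfrak{c}$-isomorphism between $\mathfrak{M}' \odot \mathfrak{R}_1$ and $\mathfrak{M}' \odot \mathfrak{R}_2$ induces a $\mathcal{B}$-isomorphism
between $\mathfrak{R}_1$ and $\mathfrak{R}_2$, and hence $\mathfrak{R}_1$ and $\mathfrak{R}_2$ are $\mathfrak{B}$-isomorphic, since $\mathfrak{B}_r \subseteq \mathcal{B}$.
Now let $x \to x^h$ be a $\mathfrak{B}$-isomorphism between $\mathfrak{R}_1$ and $\mathfrak{R}_2$. We supply the first
step in the proof that

    $\sum \psi_i \odot x_i \to \sum \psi_i \odot x_i^h$

is a $\mathfrak{c}$-isomorphism of $\mathfrak{M}' \odot \mathfrak{R}_1$ onto $\mathfrak{M}' \odot \mathfrak{R}_2$. Let $\sum \psi_i \odot x_i = 0$; then

    $\sum x_i\,\tau(\psi_i, \mathfrak{M}) = 0.$

Since $h$ is a $\mathfrak{B}$-isomorphism and $\tau(\psi_i, \mathfrak{M}) \subseteq \mathfrak{B}$, we have $\sum x_i^h\,\tau(\psi_i, \mathfrak{M}) = 0$.
Then $\sum \psi_i \odot x_i^h = 0$, and the mapping is single valued. The rest of the proof
is left to the reader. The final statement of the Theorem follows from the proof
of the Corollary to Theorem 2.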


Dually, we may state the following result. 


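THEOREM 5'. Let $(\mathfrak{M}', \mathfrak{M}, \odot)$ be a regular pairing, as in Theorem 5. Then
the mappings $\mathfrak{R}' \to \mathfrak{R}' \odot \mathfrak{M}$ and $\mathfrak{r} \to \mathfrak{r}\mathfrak{M}'$ between the $\mathfrak{B}$-direct summands of
$\mathfrak{M}'$ and the right ideal direct components of $\mathfrak{c}$ have the properties stated in Theorem 5.

We shall omit the proof of Theorem 5'.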


8. Further results on the structure of $\mathfrak{c} = \mathfrak{M}' \odot \mathfrak{M}$. Using the results of
§4, we shall establish a further theorem on the structure of the ring $\mathfrak{c} = \mathfrak{M}' \odot \mathfrak{M}$,
in case the pairing $(\mathfrak{M}', \mathfrak{M}, \tau)$ is constructed from a projective representation
of a finite group according to §2. In this case $\mathfrak{B} = \Delta(\mathfrak{G}, H, \rho)$ is a crossed
product, and the set $\Omega$ is vacuous. We shall assume that the dual pairing
$(\mathfrak{M}', \mathfrak{M}, \odot)$ is regular, so that the results of §7 are available.


THEOREM 6. Let $(\mathfrak{M}', \mathfrak{M}, \tau)$ be a regular pairing of $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{B} = \Delta(\mathfrak{G},
H, \rho)$ as defined in §2. Let the dual pairing $(\mathfrak{M}', \mathfrak{M}, \odot)$ be regular. Then every
indecomposable left or right ideal direct component of $\mathfrak{c} = \mathfrak{M}' \odot \mathfrak{M}$ contains a
unique minimal subideal.


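Proof. First let $\mathfrak{l}$ be an indecomposable left ideal direct component of $\mathfrak{c}$.
By Theorem 5, $\mathfrak{l} = \mathfrak{M}' \odot \mathfrak{R}$, where $\mathfrak{R}$ is an indecomposable $\mathfrak{B}$-direct summand
of $\mathfrak{M}$. Our assumption that the pairing $(\mathfrak{M}', \mathfrak{M}, \odot)$ is regular implies that $\mathfrak{M}$
is a projective $\mathfrak{B}$-module, by Proposition 5 and the first remark thereafter.
Therefore $\mathfrak{R}$ is $\mathfrak{B}$-isomorphic to an indecomposable right ideal direct component
of $\mathfrak{B}$, and by Proposition 4, it follows that $\mathfrak{R}$ contains a unique minimal
$\mathfrak{B}$-submodule $\mathfrak{m} \neq 0$. Since the pairing $(\mathfrak{M}', \mathfrak{M}, \odot)$ is non-degenerate,
$\mathfrak{M}' \odot \mathfrak{m} \neq 0$. Now let $\mathfrak{l}' \neq 0$ be any left ideal contained in $\mathfrak{l}$. The fact that
the pairing $(\mathfrak{M}', \mathfrak{M}, \tau)$ is regular implies that $\mathfrak{M}\mathfrak{l}' \neq 0$. By (20) we have

    $\mathfrak{l}' \supseteq \mathfrak{M}' \odot \mathfrak{M}\mathfrak{l}' \supseteq \mathfrak{M}' \odot \mathfrak{m},$

and we have proved that $\mathfrak{M}' \odot \mathfrak{m}$ is the unique minimal subideal of $\mathfrak{l}$.

Now let $\mathfrak{r}$ be an indecomposable right ideal direct component of $\mathfrak{c}$. By
Theorem 5', $\mathfrak{r} = \mathfrak{R}' \odot \mathfrak{M}$, where $\mathfrak{R}'$ is an indecomposable $\mathfrak{B}$-direct summand
of $\mathfrak{M}'$. By the second remark following Proposition 5, $\mathfrak{M}'$ is a projective
$\mathfrak{B}$-module, and the argument given in the first part of the proof can be applied
to prove that $\mathfrak{r}$ has a unique minimal subideal, as required.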


COROLLARY. Let $\Delta$ be a field, and let $E$ be the subfield of $\Delta$ consisting of those
elements of $\Delta$ left fixed by the automorphisms $\bar{s}$, $s \in \mathfrak{G}$. Let the hypotheses of
Theorem 6 be satisfied, and assume also that $\mathfrak{M}$ is finite dimensional over $\Delta$.
Then $\mathfrak{c} = \mathfrak{M}' \odot \mathfrak{M}$ is a QF-2 algebra* over the field $E$.


*A finite-dimensional algebra $\mathfrak{A}$ over a field $E$ is a QF-2 algebra (14) if every right or left
ideal direct component of $\mathfrak{A}$ contains a unique minimal subideal.







Proof. It suffices to prove that $\mathfrak{c}$ is finite dimensional over $E$. Since $\Delta$ is
commutative, the automorphisms $\bar{s}$, $s \in \mathfrak{G}$, form a finite group, and from
Galois theory it follows that $\Delta$ is a finite extension of $E$. Therefore $\mathfrak{M}$ is finite
dimensional over $E$. The elements of $\mathfrak{c}$ are l.t. in $\mathfrak{M}$ over $E$, and $\mathfrak{c}$ contains the
scalar multiplications by elements of $E$, so that $\mathfrak{c}$ is a finite dimensional algebra
over $E$, and the Corollary is proved.
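
The dimension count in this proof can be made explicit. The sketch below writes $H$ for the finite group of distinct automorphisms $\bar{s}$ of $\Delta$ (a name introduced here only for illustration) and uses Artin's theorem on the fixed field of a finite group of automorphisms.

```latex
% H = \{\bar{s} : s \in \mathfrak{G}\}, a finite group of automorphisms of the field \Delta,
% with fixed field E.  By Artin's theorem,
\[
[\Delta : E] = |H| \le |\mathfrak{G}| .
\]
% Consequently
\[
\dim_E \mathfrak{M} = \dim_\Delta \mathfrak{M} \cdot [\Delta : E] < \infty,
\qquad
\dim_E \mathfrak{c} \le (\dim_E \mathfrak{M})^2 ,
\]
% the last bound because \mathfrak{c} consists of E-linear transformations of \mathfrak{M}.
```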

Thrall's paper (14) contains a number of results concerning QF-2 algebras,
all of which are directly applicable to $\mathfrak{c}$. We refer the reader to that paper for
the details.

We add a final remark on the application of the theory to projective representations
of groups. Let $(\mathfrak{M}', \mathfrak{M}, \tau)$ be a regular pairing of $\mathfrak{M}' \times \mathfrak{M} \to \mathfrak{B}$,
constructed as in §2, and let $\mathfrak{C}$ be a centralizer of $\mathfrak{M}$ relative to $\mathfrak{B}$. Then
Proposition 4, and the results of §5, can be applied to prove that every indecomposable
$\mathfrak{C}$-direct summand of $\mathfrak{M}$ contains a unique minimal submodule.
The proof is similar to the proof of Theorem 6, and will be omitted.


9. Applications to the Galois theory of primitive rings with minimal
ideals. Let $\mathfrak{M}'$ and $\mathfrak{M}$ be left and right, respectively, vector spaces over a
division ring $\Delta$, which are dual relative to a non-degenerate bilinear form
$(\psi, x)$ on $\mathfrak{M}' \times \mathfrak{M} \to \Delta$. Let $\mathfrak{L}(\mathfrak{M}', \mathfrak{M})$ be the set of l.t. $A$ on $\mathfrak{M}$ over $\Delta$ which
possess transposes relative to the form $(\psi, x)$, and let $\mathfrak{F}(\mathfrak{M}', \mathfrak{M})$ be the subset
of $\mathfrak{L}(\mathfrak{M}', \mathfrak{M})$ consisting of finite valued l.t. We shall consider a ring $\mathfrak{A}$ of l.t.
in $\mathfrak{M}$ over $\Delta$ such that (7, 8)

(25)    $\mathfrak{F}(\mathfrak{M}', \mathfrak{M}) \subseteq \mathfrak{A} \subseteq \mathfrak{L}(\mathfrak{M}', \mathfrak{M}),$

together with a finite group $\mathfrak{G}$ of automorphisms $A \to A^s$ of $\mathfrak{A}$. Then $\mathfrak{A}$ is a
primitive ring with minimal ideals, and conversely, every primitive ring with
minimal ideals is isomorphic to a dense ring of l.t. which satisfies (25). Let $\mathfrak{C}$
be the set of elements of $\mathfrak{A}$ which are left fixed by all the elements of $\mathfrak{G}$. We
shall indicate how $\mathfrak{C}$ may be regarded as a centralizer of $\mathfrak{M}$ relative to a crossed
product $\Delta(\mathfrak{G}, H, \rho)$, so that the results of §§5-8 can be applied to discuss, for
example, the subspaces of $\mathfrak{M}$ which are invariant relative to $\mathfrak{C}$.

For each element $s$ in $\mathfrak{G}$, there exists a (1-1) s.l.t. $U_s$ with associated automorphism
$\bar{s}$ of $\mathfrak{M}$ onto itself, which possesses a transpose relative to the form,
and which satisfies the equation

(26)    $A^s = U_s^{-1} A U_s$

for all $A \in \mathfrak{A}$. Since $(A^s)^t = A^{st}$, we obtain from (26),

    $U_t^{-1} U_s^{-1} A\, U_s U_t = U_{st}^{-1} A\, U_{st},$

and

    $A\, U_s U_t U_{st}^{-1} = U_s U_t U_{st}^{-1} A,$

for all $A$. Since $\mathfrak{A}$ is a dense ring of l.t., for each pair $(s, t)$ there exists a scalar
multiplication $\rho_{s,t}^{(\bar{s}\bar{t})^{-1}}$ such that

    $U_s U_t U_{st}^{-1} = \rho_{s,t}^{(\bar{s}\bar{t})^{-1}},$

or

(27)    $U_s U_t = U_{st}\,\rho_{s,t}.$
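
The passage from (26) to the commutation relation can be spelled out as follows; the only inputs are (26) and the group property $(A^s)^t = A^{st}$ stated above.

```latex
% From (26): A^s = U_s^{-1} A U_s.  Apply t to A^s, and apply st to A:
\[
(A^s)^t = U_t^{-1}\bigl(U_s^{-1} A\, U_s\bigr) U_t ,
\qquad
A^{st} = U_{st}^{-1} A\, U_{st} .
\]
% Equate the two expressions, multiply on the left by U_s U_t
% and on the right by U_{st}^{-1}:
\[
A\, U_s U_t U_{st}^{-1} = U_s U_t U_{st}^{-1}\, A
\qquad\text{for all } A \in \mathfrak{A} .
\]
```

Thus $U_sU_tU_{st}^{-1}$ commutes with every element of the dense ring, which is what the density argument then uses.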


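It is now easy to verify that $\{\rho_{s,t};\ \bar{s}\}$ is a factor set, and that if $\mathfrak{B} = \Delta(\mathfrak{G}, H, \rho)$
is the corresponding crossed product, then the mappings $U_s$ define a representation
of $\mathfrak{B}$ by endomorphisms of $\mathfrak{M}$. Since (26) is unchanged if we replace $U_s$
by $U_s\mu_s$, we may assume that $\rho_{1,1} = 1$. Then the condition

    $\rho_{s,s^{-1}} = 1$

of Proposition 1 is satisfied if and only if $U_sU_{s^{-1}} = U_1$ for all $s$ in $\mathfrak{G}$. We have
to show finally that $\mathfrak{C}$ satisfies (19). The elements of $\mathfrak{C}$ are $\mathfrak{B}$-endomorphisms
of $\mathfrak{M}$. On the other hand, by (15) it follows that $\mathfrak{M}' \odot \mathfrak{M}$ is precisely the set
of l.t. $\sum_s A^s$, where $A$ ranges throughout $\mathfrak{F}(\mathfrak{M}', \mathfrak{M}) \subseteq \mathfrak{A}$, so that $\mathfrak{M}' \odot \mathfrak{M} \subseteq \mathfrak{C}$,
and (19) is proved.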


10. On the centralizer of a projective module. It seems probable
that more penetrating results than we have obtained in §§7 and 8 can be
proved concerning the structure of the centralizer of a projective module.
To support this view we shall prove the following result.


THEOREM 7. Let $\mathfrak{A}$ be a commutative symmetric algebra of l.t. on a finite
dimensional space $\mathfrak{M}$ over a field $\Phi$ such that $\mathfrak{M}$ is a unital projective (right)
$\mathfrak{A}$-module. Then the centralizer $\mathfrak{C}$ of $\mathfrak{M}$ relative to $\mathfrak{A}$ is a symmetric algebra.


Proof. The only consequences which we shall require of the assumption
that $\mathfrak{M}$ is a projective $\mathfrak{A}$-module are the following: (a) the indecomposable
direct summands of the $\mathfrak{A}$-module $\mathfrak{M}$ are $\mathfrak{A}$-isomorphic to indecomposable
right ideal direct components of $\mathfrak{A}$ (9, Theorem 1); and (b) if $\mathfrak{M} = \mathfrak{M}_1 \oplus \mathfrak{M}_2$,
where $\mathfrak{M}_1$ and $\mathfrak{M}_2$ are $\mathfrak{A}$-modules, then $\mathfrak{M}_1$ and $\mathfrak{M}_2$ are projective $\mathfrak{A}$-modules
(5, p. 473).

We recall that $\mathfrak{A}$ is symmetric if and only if there exists a hyperplane
$\mu(a) = 0$, which contains all commutators $ab - ba$ but no non-zero right or
left ideals. We shall require the result that if $\mathfrak{A}$ and $\mathfrak{B}$ are symmetric algebras,
then the Kronecker product $\mathfrak{A} \otimes \mathfrak{B}$ is symmetric.

Now we begin the proof of the theorem. First assume that $\mathfrak{A} = \mathfrak{A}_1 \oplus \mathfrak{A}_2$,
where $\mathfrak{A}_1$ and $\mathfrak{A}_2$ are non-zero ideals. If we set $\mathfrak{M}_i = \mathfrak{M}\mathfrak{A}_i$ ($i = 1, 2$), then each
$\mathfrak{M}_i$ is a faithful $\mathfrak{A}_i$-module, and $\mathfrak{M} = \mathfrak{M}_1 \oplus \mathfrak{M}_2$. The elements of the centralizer
$\mathfrak{C}_i$ of $\mathfrak{M}_i$ relative to $\mathfrak{A}_i$ ($i = 1, 2$) may be viewed as elements of the
centralizer $\mathfrak{C}$ of $\mathfrak{M}$ relative to $\mathfrak{A}$, and with this agreement, $\mathfrak{C} = \mathfrak{C}_1 \oplus \mathfrak{C}_2$.
It follows that $\mathfrak{C}$ is symmetric if we can prove that the $\mathfrak{C}_i$ are symmetric.
Furthermore, each $\mathfrak{M}_i$ is a projective $\mathfrak{A}$-module, and hence a projective
$\mathfrak{A}_i$-module ($i = 1, 2$). Thus we may assume, without loss of generality, that,
in addition to the hypotheses stated in the theorem, $\mathfrak{A}$ is an indecomposable
algebra. Now let

    $\mathfrak{M} = \mathfrak{M}_1 \oplus \cdots \oplus \mathfrak{M}_s,$

where the $\mathfrak{M}_i$ are indecomposable $\mathfrak{A}$-modules. Since $\mathfrak{M}$ is projective, each $\mathfrak{M}_i$ is
$\mathfrak{A}$-isomorphic to $\mathfrak{A}$, by the indecomposability of $\mathfrak{A}$, and hence the $\mathfrak{M}_i$ are
isomorphic to each other. Evidently $\mathfrak{M}_i$ is a faithful cyclic $\mathfrak{A}$-module. Since $\mathfrak{A}$
is commutative, the centralizer of $\mathfrak{M}_i$ relative to $\mathfrak{A}$ is isomorphic to $\mathfrak{A}$. The
centralizer $\mathfrak{C}$ of $\mathfrak{M}$ is isomorphic to the full algebra of $s$ by $s$ matrices with
coefficients in the centralizer of $\mathfrak{M}_i$ (6, p. 58), and hence

    $\mathfrak{C} \cong (\mathfrak{A})_s \cong \mathfrak{A} \otimes \Phi_s.$

Since both $\mathfrak{A}$ and $\Phi_s$ are symmetric algebras, we conclude that $\mathfrak{C}$ is symmetric,
and the theorem is proved.
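
The fact that the full matrix algebra $\Phi_s$ is symmetric, used in the last step, can be checked against the hyperplane criterion recalled above. A minimal sketch, taking $\mu$ to be the ordinary trace and $e_{ij}$ the usual matrix units (both choices are made here for illustration and are not taken from the paper):

```latex
% \Phi_s = algebra of s-by-s matrices over \Phi,  \mu(a) = \operatorname{tr}(a).
% (1) The hyperplane \mu(a) = 0 contains all commutators:
\[
\mu(ab - ba) = \operatorname{tr}(ab) - \operatorname{tr}(ba) = 0 .
\]
% (2) It contains no non-zero one-sided ideal.  For a = (a_{ij}) and a matrix unit e_{ji},
\[
\operatorname{tr}(a\,e_{ji}) = \operatorname{tr}(e_{ji}\,a) = a_{ij} .
\]
% If a \neq 0, choose (i, j) with a_{ij} \neq 0: then a e_{ji} \in a\Phi_s and
% e_{ji} a \in \Phi_s a both lie outside the hyperplane, so neither the right ideal
% nor the left ideal generated by a is contained in it.
```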


11. Examples of regular pairings. We shall consider the pairing
$\sigma$ of §3, which has been studied by Weyl in connection with the representation
theory of the full linear group. Let $\Phi$ be an arbitrary field of characteristic
$p \geq 0$. Let $\mathfrak{M}$ be the $m$-fold Kronecker product with itself of an $n$-dimensional
space $\mathfrak{V}$ over $\Phi$. Let $\mathfrak{G} = \mathfrak{S}_m$ be the symmetric group on $m$ letters, and let
$b \to U(b)$ be the (ordinary) representation of the group algebra $\mathfrak{B}$ of $\mathfrak{G}$ by
symmetry operators on $\mathfrak{M}$. Let $(\mathfrak{M}, \mathfrak{M}^*, \sigma)$ be the pairing defined in §3, and
let $\mathfrak{b}$ be the nucleus $\sigma(\mathfrak{M}, \mathfrak{M}^*)$. We shall state without proof a few special
results.

(a) $p > m$ or $p = 0$, $n$ arbitrary. Then $\mathfrak{B}$ is semi-simple, and the pairing is
regular. The centrally primitive idempotents of $\mathfrak{B}$ which are contained in $\mathfrak{b}$
have been determined explicitly by Weyl (17, Chap. IV).

(b) $m \leq n$, $p$ arbitrary. Then $\mathfrak{b} = \mathfrak{B}$, and the pairing is regular.

(c) m = 3, p = 3, n = 2. Then 6 = &, and the pairing is regular. In this 
case the kernel & of the representation U is different from zero, and 6 (\ R =&. 


REFERENCES 


1. E. Artin, C. J. Nesbitt, and R. M. Thrall, Rings with minimum condition, University of Michigan Publications in Mathematics, No. 1, 1944.

2. R. Brauer, On sets of matrices with coefficients in a division ring, Trans. Amer. Math. Soc., 49 (1951), 502-548.

3. H. Fitting, Die Theorie der Automorphismenringe Abelscher Gruppen und ihr Analogon bei nicht kommutativen Gruppen, Math. Ann., 107 (1932), 514-542.

4. W. Gaschütz, Über den Fundamentalsatz von Maschke zur Darstellungstheorie der endlichen Gruppen, Math. Z., 56 (1952), 376-387.

5. F. Kasch, Grundlagen einer Theorie der Frobeniuserweiterungen, Math. Ann., 127 (1954), 453-474.

6. N. Jacobson, The theory of rings, Mathematical Surveys, II (New York, 1943).

7. N. Jacobson, The radical and semi-simplicity for arbitrary rings, Amer. J. Math., 67 (1945), 300-320.

8. N. Jacobson, Lectures in abstract algebra, II (New York, 1953).

9. H. Nagao and T. Nakayama, On the structure of ($M_0$) and ($M_u$) modules, Math. Zeit., 59 (1953), 164-170.

10. T. Nakayama, On Frobeniusean algebras, I, Ann. Math., 40 (1939), 611-633.

11. T. Nakayama, On Frobeniusean algebras, II, Ann. Math., 42 (1941), 1-21.

12. C. Nesbitt, On the regular representations of algebras, Ann. Math., 39 (1938), 634-658.

13. C. Nesbitt and R. Thrall, Some ring theorems with applications to modular representations, Ann. Math., 47 (1946), 551-567.

14. R. Thrall, Some generalizations of quasi-Frobenius algebras, Trans. Amer. Math. Soc., 64 (1948), 173-183.

15. H. Weyl, Commutator algebra of a finite group of collineations, Duke Math. J., 3 (1937), 200-212.

16. H. Weyl, The theory of groups and quantum mechanics (New York, 1931).

17. H. Weyl, The classical groups (Princeton, 1939).








University of Wisconsin 







THE COVERING OF SPACE BY SPHERES 


E. S. BARNES 


1. Introduction. Bambah (1) has recently determined the most economical covering of three dimensional space by equal spheres whose centres form a lattice, the density of this covering being

$$\vartheta_3 = \frac{5\sqrt{5}\,\pi}{24}. \tag{1.1}$$

As is well known, this problem may be interpreted in terms of the inhomogeneous minimum of a positive definite quadratic form. If $f(x) = f(x_1, x_2, \ldots, x_n)$ $(n \geq 2)$ is a positive quadratic form of determinant $D$, then, for any real $a = (a_1, a_2, \ldots, a_n)$, we define $m(f; a)$ to be the minimum of $f(x + a)$ for integral $x$. The inhomogeneous minimum of $f(x)$ is then defined as

$$m(f) = \max_a m(f; a).$$

If now $\vartheta_n$ is the density of the most economical covering of n-dimensional space by lattice-ordered spheres, we have

$$\left(\frac{\vartheta_n}{J_n}\right)^{2/n} = \min_f \frac{m(f)}{D^{1/n}}, \tag{1.2}$$

where $J_n$ is the volume of the unit sphere:

$$x_1^2 + x_2^2 + \cdots + x_n^2 \leq 1.$$

Thus (1.1) is equivalent to the assertion that

$$m(f) \geq \left(\frac{125}{1024}\, D\right)^{1/3}$$

for all $f(x_1, x_2, x_3)$, and that the equality sign holds for some form $f$.
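The equivalence just stated is a matter of arithmetic, and a short numerical sketch may make it concrete. It assumes the standard density formula for a lattice covering (spheres of radius $\sqrt{m(f)}$, fundamental cell of volume $\sqrt{D}$); the function name `covering_density` and the sample value of `D` are illustrative choices, not part of the paper.

```python
import math

def covering_density(m, D, n):
    """Density of the lattice covering attached to a form with inhomogeneous
    minimum m and determinant D: spheres of radius sqrt(m), cell volume sqrt(D)."""
    J_n = math.pi ** (n / 2) / math.gamma(n / 2 + 1)  # volume of the unit n-sphere
    return J_n * m ** (n / 2) / math.sqrt(D)

D = 16.0                              # illustrative determinant (any D > 0 gives the same density)
m = (125 * D / 1024) ** (1 / 3)       # the bound on m(f), taken with equality
theta_3 = covering_density(m, D, 3)
bambah = 5 * math.sqrt(5) * math.pi / 24   # the value (1.1)

print(theta_3, bambah)
```

Both printed numbers should agree up to rounding, which is all the sketch is meant to show.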

It is natural to introduce here the notion of an extreme form, by analogy with the corresponding homogeneous problem. We shall say that $f(x)$ is extreme if the ratio $m(f)/D^{1/n}$ is a (local) minimum, i.e. is not decreased by any sufficiently small variation of the coefficients of $f$. Forms for which $m(f)/D^{1/n}$ is an absolute minimum may be called absolutely extreme. Since $m(f)$ and $D$ are invariant under equivalence transformations (integral unimodular transformations of $x_1, \ldots, x_n$), while $m(f)/D^{1/n}$ is unaltered by multiplying $f$ by an arbitrary positive constant, the property of being extreme is shared by the class of forms consisting of all forms equivalent to a multiple of some one form of the class.





Received September 13, 1955. 













I prove here: 


THEOREM 1. If $n = 3$, there is just one class of extreme forms, represented by

$$f_0(x_1, x_2, x_3) = 3x_1^2 + 3x_2^2 + 3x_3^2 - 2x_1x_2 - 2x_1x_3 - 2x_2x_3; \tag{1.3}$$

and for this class

$$m(f) = \left(\frac{125}{1024}\, D\right)^{1/3}. \tag{1.4}$$


This theorem clearly includes the results of Bambah (1) (where the question 
of the existence of other classes of extreme forms is left open). 
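As a plausibility check on (1.4) for the form $f_0$ itself, the following sketch evaluates $m(f_0; a)$ by brute force at a single point $a$. The choice $a = (3/4, 1/2, 1/4)$ is an assumption of this sketch (it is the solution of the face equations (2.4) of §2 when all coefficients equal 1), as are the helper names; the search cube is finite, which suffices here because $f_0(y) \geq y_1^2 + y_2^2 + y_3^2$. The computation only exhibits one point where the minimum equals the value in (1.4); it does not by itself prove that no other point does worse.

```python
from fractions import Fraction
from itertools import product

def f0(x):
    """The form (1.3)."""
    x1, x2, x3 = x
    return 3*x1*x1 + 3*x2*x2 + 3*x3*x3 - 2*x1*x2 - 2*x1*x3 - 2*x2*x3

def m_at(a, box=3):
    """Minimum of f0(x + a) over integral x in the cube [-box, box]^3."""
    return min(f0(tuple(xi + ai for xi, ai in zip(x, a)))
               for x in product(range(-box, box + 1), repeat=3))

a = (Fraction(3, 4), Fraction(1, 2), Fraction(1, 4))   # assumed test point
value = m_at(a)
D = 16   # determinant of the matrix [[3,-1,-1],[-1,3,-1],[-1,-1,3]] of f0

print(value, value ** 3 == Fraction(125 * D, 1024))
```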

The object of this paper is, however, not so much to establish the above 
refinement of Bambah’s results as to give a much simpler proof, which also 
suggests a method of attacking the problem when $n \geq 4$.

The starting point of the proof is Voronoi's method of reduction of a positive form $f$ and the construction of the polyhedron $\Pi$ associated with $f$. These are discussed in §2. Theorem 1 is proved in §3, while §4 contains some remarks on the method and the possibility of extending it to higher dimensions.


2. Reduced forms and their polyhedra. Voronoi (3, p. 150) has shown that every class of equivalent positive forms in 3 variables contains a form expressible as

$$f(x_1, x_2, x_3) = \rho_{01}x_1^2 + \rho_{02}x_2^2 + \rho_{03}x_3^2 + \rho_{12}(x_1 - x_2)^2 + \rho_{13}(x_1 - x_3)^2 + \rho_{23}(x_2 - x_3)^2, \tag{2.1}$$

where $\rho_{ij} \geq 0$ $(i, j = 0, \ldots, 3)$; and clearly the $\rho_{ij}$ are uniquely determined by $f$. We call such a form reduced (in the sense of Voronoi).

The $\rho_{ij}$ are not in general determined by the class of $f$. We have in fact, defining for convenience

$$\rho_{ij} = \rho_{ji} \quad (i > j),$$

LEMMA 2.1. If $p, q, r, s$ is an arbitrary permutation of 0, 1, 2, 3, then the form

$$\rho_{pq}x_1^2 + \rho_{pr}x_2^2 + \rho_{ps}x_3^2 + \rho_{qr}(x_1 - x_2)^2 + \rho_{qs}(x_1 - x_3)^2 + \rho_{rs}(x_2 - x_3)^2 \tag{2.2}$$

is equivalent to the form (2.1).

Proof. The result is obvious if $p = 0$, since then (2.2) arises from (2.1) by the transformation $x_q \to x_1$, $x_r \to x_2$, $x_s \to x_3$. It therefore suffices to prove the result for $p, q, r, s = 1, 0, 2, 3$; this however corresponds to transforming (2.1) by

$$x_1 \to x_1, \quad x_2 \to x_1 - x_2, \quad x_3 \to x_1 - x_3.$$


This Lemma is the genesis of the suffix notation in (2.1), and provides an 
“argument by symmetry” which will be frequently used in what follows. 
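The substitution used in the proof of Lemma 2.1 can be checked mechanically. The sketch below compares the two sides on a grid of integer points for one arbitrary choice of coefficients; the dictionary layout for the $\rho_{ij}$ and the helper names `form` and `permute` are assumptions made for illustration.

```python
from itertools import product

PAIRS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def form(rho, x):
    """Reduced form (2.1) with x_0 = 0; rho maps pairs (i, j), i < j, to coefficients."""
    xs = (0,) + tuple(x)
    return sum(rho[(i, j)] * (xs[i] - xs[j]) ** 2 for (i, j) in PAIRS)

def permute(rho, perm):
    """Coefficients of (2.2): the pair (a, b) receives rho_{perm[a] perm[b]}."""
    return {(a, b): rho[tuple(sorted((perm[a], perm[b])))] for (a, b) in PAIRS}

rho = dict(zip(PAIRS, (1, 2, 3, 5, 7, 11)))      # arbitrary sample coefficients
rho_swapped = permute(rho, (1, 0, 2, 3))          # p, q, r, s = 1, 0, 2, 3

# (2.2) at x should equal (2.1) at (x1, x1 - x2, x1 - x3).
agree = all(
    form(rho_swapped, (x1, x2, x3)) == form(rho, (x1, x1 - x2, x1 - x3))
    for x1, x2, x3 in product(range(-3, 4), repeat=3)
)
print(agree)
```

Since both sides are quadratic polynomials in $x$, agreement on this grid already forces them to coincide identically.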




The set of points of space which are at least as near to the origin as to any integral point $l$ (with the metric defined by $f$) forms a closed bounded convex polyhedron $\Pi$, the intersection of the half-spaces

$$f(x) \leq f(x - l),$$

where $l$ runs through all integral points. $\Pi$ may in fact be defined by a finite number $2\sigma \leq 2(2^n - 1)$ of these inequalities of the type

$$f(x) \leq f(x \pm l_k) \quad (k = 1, \ldots, \sigma).$$

The planes $f(x) = f(x \pm l_k)$ are then the faces of $\Pi$.

Perhaps the simplest method of obtaining $l_1, \ldots, l_\sigma$ is to use the criterion established by Voronoi (4, p. 277): a point $l$ $(\neq 0)$ appears in the set $\pm l_1, \ldots, \pm l_\sigma$ if and only if the minimum of $f(x)$ over $x \equiv l \pmod 2$ is attained only for $x = \pm l$.

It is clear that, for the form (2.1), the minimum of $f(x)$ for prescribed parities of $x_1, x_2, x_3$ is attained when the even $x_i$ are zero and the odd $x_i$ are all 1 or all $-1$; and in general (e.g., if all $\rho_{ij} > 0$) only for these two sets. Thus, in general, $\Pi$ has 7 pairs of parallel faces, for which we can find a symmetrical notation as follows:

Define $x_0 = 0$, so that

$$f = \sum_{0 \leq i < j \leq 3} \rho_{ij}(x_i - x_j)^2,$$

and set

$$y_i = \tfrac{1}{2}\frac{\partial f}{\partial x_i} = \sum_{j} \rho_{ij}(x_i - x_j) \quad (i = 0, \ldots, 3);$$

then the 14 faces of $\Pi$ are given by

$$\begin{aligned}
F_i &: \quad 2y_i = \sum_{t \neq i} \rho_{it},\\
F_{ij} &: \quad 2(y_i + y_j) = \sum_{t \neq i, j} (\rho_{it} + \rho_{jt}),\\
F_{ijk} &: \quad 2(y_i + y_j + y_k) = \sum_{t \neq i, j, k} (\rho_{it} + \rho_{jt} + \rho_{kt}),
\end{aligned} \tag{2.3}$$

where all indices and summations run from 0 to 3. Since clearly $\sum y_i = 0$, the faces $F_i$, $F_{jkl}$ and the faces $F_{ij}$, $F_{kl}$ are parallel, where $i, j, k, l$ is any permutation of 0, 1, 2, 3.
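Voronoi's criterion and the parity argument above lend themselves to a brute-force illustration. The sketch below takes one arbitrary reduced form with all coefficients positive and, for each of the 7 non-zero parity classes, looks for the minimizers of $f$ inside a finite cube; the cube size and the sample coefficients are assumptions of the sketch.

```python
from itertools import product

PAIRS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def form(rho, x):
    """Reduced form (2.1) with x_0 = 0."""
    xs = (0,) + tuple(x)
    return sum(rho[(i, j)] * (xs[i] - xs[j]) ** 2 for (i, j) in PAIRS)

rho = dict(zip(PAIRS, (1, 2, 3, 5, 7, 11)))   # sample coefficients, all positive

relevant = {}
for l in product((0, 1), repeat=3):
    if l == (0, 0, 0):
        continue
    # all points of the cube congruent to l modulo 2
    cls = [x for x in product(range(-3, 4), repeat=3)
           if all((xi - li) % 2 == 0 for xi, li in zip(x, l))]
    best = min(form(rho, x) for x in cls)
    minimizers = sorted(x for x in cls if form(rho, x) == best)
    # the criterion holds for l when the only minimizers are l and -l
    relevant[l] = minimizers == sorted([l, tuple(-c for c in l)])

print(relevant)
```

Each of the 7 classes should report `True`, matching the count of 7 pairs of parallel faces.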
It is easy to verify that the faces

$$\begin{aligned}
F_1 &: \quad 2y_1 = 2\rho_{01}x_1 + 2\rho_{12}(x_1 - x_2) + 2\rho_{13}(x_1 - x_3) = \rho_{01} + \rho_{12} + \rho_{13},\\
F_{12} &: \quad 2(y_1 + y_2) = 2\rho_{01}x_1 + 2\rho_{02}x_2 + 2\rho_{13}(x_1 - x_3) + 2\rho_{23}(x_2 - x_3) = \rho_{01} + \rho_{02} + \rho_{13} + \rho_{23},\\
F_{123} &: \quad 2(y_1 + y_2 + y_3) = 2\rho_{01}x_1 + 2\rho_{02}x_2 + 2\rho_{03}x_3 = \rho_{01} + \rho_{02} + \rho_{03},
\end{aligned} \tag{2.4}$$

determine a vertex $v_{123}$ of $\Pi$; thus for example we have

$$|2(y_0 + y_1 + y_3)| = |2y_2| = |\rho_{02} - \rho_{12} + \rho_{23}| \leq \rho_{02} + \rho_{12} + \rho_{23}.$$

Applying all 4! permutations of the suffixes 0, 1, 2, 3, we obtain 4! distinct sets $(F_i, F_{ij}, F_{ijk})$ of faces determining 4! vertices $v_{ijk}$. Since $\Pi$ has at most 4! vertices (4, p. 205), we have therefore determined all vertices of $\Pi$.

Our next task is to determine $m(f)$. From the definition of $\Pi$ it is clear that

$$m(f) = \max_{x \in \Pi} f(x);$$

and by the convexity of $\Pi$ and of the ellipsoid $f(x) \leq m(f)$, it follows that

$$m(f) = \max f(v)$$

over all vertices $v$ of $\Pi$.

To calculate the values of $f(v)$, it suffices to evaluate $f(v_{123})$ and then to apply all permutations of suffixes in the $\rho_{ij}$; and the evaluation of $f(v_{123})$ may be simplified by observing that

$$f(x) = x_1y_1 + x_2y_2 + x_3y_3.$$

A direct calculation gives

$$4D f(v_{123}) = D(\rho_{01} + \rho_{02} + \rho_{03} + \rho_{12} + \rho_{13} + \rho_{23}) - K - 4\rho_{01}\rho_{03}\rho_{12}\rho_{23}, \tag{2.5}$$

where $D$ is the determinant of $f$ (and of the equations (2.9)) and†

$$K = \sum \rho_{01}\rho_{02}\rho_{03}(\rho_{12} + \rho_{13} + \rho_{23}).$$


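Since $D$, $\sum \rho_{ij}$ and $K$ are invariant under permutation of suffixes of the $\rho_{ij}$, it follows from (2.5) that $f(v)$ has at most 3 distinct values for vertices $v$ of $\Pi$. Denoting these by $f_1, f_2, f_3$ and setting

$$\lambda_1 = \rho_{01}\rho_{23}, \quad \lambda_2 = \rho_{02}\rho_{13}, \quad \lambda_3 = \rho_{03}\rho_{12}, \tag{2.6}$$

we have

$$\begin{aligned}
4Df_1 &= D\left(\textstyle\sum \rho_{ij}\right) - K - 4\lambda_2\lambda_3;\\
4Df_2 &= D\left(\textstyle\sum \rho_{ij}\right) - K - 4\lambda_3\lambda_1;\\
4Df_3 &= D\left(\textstyle\sum \rho_{ij}\right) - K - 4\lambda_1\lambda_2.
\end{aligned} \tag{2.7}$$

Since

$$D(f_i - f_j) = \lambda_k(\lambda_i - \lambda_j)$$

for $i, j, k$ a permutation of 1, 2, 3, the value of

$$m(f) = \max(f_1, f_2, f_3) \tag{2.8}$$

is easily decided from the relative magnitudes of $\lambda_1, \lambda_2, \lambda_3$.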
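The vertex computation can be reproduced in exact rational arithmetic. The following sketch, for one arbitrary choice of positive coefficients, solves the three face equations for each of the 24 orderings of the suffixes, evaluates $f$ at the resulting vertices, and compares the values with (2.7). The helper names and the use of Cramer's rule are assumptions of the sketch, not the author's procedure.

```python
from fractions import Fraction
from itertools import permutations

PAIRS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def r(rho, i, j):
    return rho[(min(i, j), max(i, j))]

def form(rho, x):
    xs = [Fraction(0)] + list(x)
    return sum(rho[p] * (xs[p[0]] - xs[p[1]]) ** 2 for p in PAIRS)

def face(rho, S):
    """Face F_S of (2.3) as (coefficients of x1..x3, constant term)."""
    a, c = [Fraction(0)] * 4, Fraction(0)
    for i in S:
        for t in set(range(4)) - set(S):
            a[i] += 2 * r(rho, i, t)
            a[t] -= 2 * r(rho, i, t)
            c += r(rho, i, t)
    return a[1:], c          # x_0 = 0, so its coefficient is dropped

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(A, b):
    """Cramer's rule for a 3x3 system with exact rational entries."""
    d = det3(A)
    return [det3([row[:k] + [b[n]] + row[k + 1:] for n, row in enumerate(A)]) / d
            for k in range(3)]

rho = dict(zip(PAIRS, map(Fraction, (1, 2, 3, 5, 7, 11))))   # sample coefficients

vertex_values = set()
for i, j, k, _ in permutations(range(4)):
    rows, rhs = zip(*(face(rho, S) for S in ((i,), (i, j), (i, j, k))))
    vertex_values.add(form(rho, solve3(list(rows), rhs)))

# Right-hand sides of (2.7).
A = [[sum(r(rho, i, t) for t in range(4) if t != i) if i == j else -r(rho, i, j)
      for j in (1, 2, 3)] for i in (1, 2, 3)]
D, total, K = det3(A), sum(rho.values()), 0
for v in range(4):
    a, b, c = (u for u in range(4) if u != v)
    K += r(rho, v, a) * r(rho, v, b) * r(rho, v, c) * (r(rho, a, b) + r(rho, a, c) + r(rho, b, c))
lam = [rho[(0, 1)] * rho[(2, 3)], rho[(0, 2)] * rho[(1, 3)], rho[(0, 3)] * rho[(1, 2)]]
f_values = {(D * total - K - 4 * lam[(n + 1) % 3] * lam[(n + 2) % 3]) / (4 * D) for n in range(3)}

print(sorted(vertex_values), max(f_values))
```

For generic coefficients the 24 vertices should produce exactly the three values $f_1, f_2, f_3$, the largest being $m(f)$.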
The above analysis has been carried out on the assumption that $\Pi$ has 14 faces. If some of the $\rho_{ij}$ vanish, some of the planes (2.4) are linearly dependent on the others and may be discarded. The effect of this is that certain of the 24 vertices coincide; thus if $\rho_{12} = \rho_{13} = \rho_{23} = 0$, $\Pi$ degenerates to a parallelepiped, and $f(v)$ is the same for each of its 8 vertices. Such degeneration, however, does not affect the validity of our final results (2.7), (2.8).

† We use here the usual summation convention for symmetric functions, so that $K$ is the sum of the four distinct terms obtainable by cyclic permutations of 0, 1, 2, 3.

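It is convenient to note here, before proceeding to the proof of Theorem 1, some formulae concerning $D$, $K$ and their derivatives.

We have

$$D = \begin{vmatrix}
\rho_{01} + \rho_{12} + \rho_{13} & -\rho_{12} & -\rho_{13}\\
-\rho_{12} & \rho_{02} + \rho_{12} + \rho_{23} & -\rho_{23}\\
-\rho_{13} & -\rho_{23} & \rho_{03} + \rho_{13} + \rho_{23}
\end{vmatrix}
= \sum \rho_{01}\rho_{02}\rho_{03} + \sum \rho_{01}\rho_{23}(\rho_{02} + \rho_{03} + \rho_{12} + \rho_{13}) \tag{2.9}$$

and, writing for convenience

$$\sigma_i = \rho_{jk}\rho_{jl} + \rho_{jk}\rho_{kl} + \rho_{jl}\rho_{kl} \tag{2.10}$$

(where $i, j, k, l$ is any permutation of 0, 1, 2, 3),

$$\frac{\partial D}{\partial \rho_{01}} = \sigma_0 + \sigma_1 + \lambda_2 + \lambda_3. \tag{2.11}$$

Using symmetry, we obtain

$$\frac{\partial D}{\partial \rho_{01}} - \frac{\partial D}{\partial \rho_{13}} = \sigma_0 - \sigma_3 - \lambda_1 + \lambda_2, \tag{2.12}$$

$$\frac{\partial D}{\partial \rho_{01}} + \frac{\partial D}{\partial \rho_{23}} - \frac{\partial D}{\partial \rho_{02}} - \frac{\partial D}{\partial \rho_{13}} = 2(\lambda_2 - \lambda_1). \tag{2.13}$$

Similarly we find

$$\frac{\partial K}{\partial \rho_{01}} = \rho_{02}\rho_{03}(\rho_{12} + \rho_{13} + \rho_{23}) + \rho_{12}\rho_{13}(\rho_{02} + \rho_{03} + \rho_{23}) + \rho_{02}\rho_{12}\rho_{23} + \rho_{03}\rho_{13}\rho_{23}, \tag{2.14}$$

$$\frac{\partial K}{\partial \rho_{01}} - \frac{\partial K}{\partial \rho_{13}} = (\lambda_2 - \lambda_1)(\rho_{03} + \rho_{12}) - \lambda_3(\rho_{01} - \rho_{02} + \rho_{23} - \rho_{13}) - (\rho_{03} + \rho_{12})(\rho_{01}\rho_{02} - \rho_{13}\rho_{23}); \tag{2.15}$$

interchanging 1 and 2 and subtracting gives

$$\frac{\partial K}{\partial \rho_{01}} + \frac{\partial K}{\partial \rho_{23}} - \frac{\partial K}{\partial \rho_{02}} - \frac{\partial K}{\partial \rho_{13}} = 2(\lambda_2 - \lambda_1)(\rho_{03} + \rho_{12}) - 2\lambda_3(\rho_{01} - \rho_{02} + \rho_{23} - \rho_{13}). \tag{2.16}$$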
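The identities (2.9), (2.11), (2.13) and (2.16) can be spot-checked with integer arithmetic. Because $D$ and $K$ are of degree one in each coefficient, the difference obtained by adding 1 to a single coefficient equals the corresponding partial derivative exactly; the sketch below relies on that observation. The helper names and the sample coefficients are assumptions, and $\sigma_i$ is taken as the sum of pairwise products over the three coefficients not involving the suffix $i$.

```python
PAIRS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def r(q, i, j):
    return q[(min(i, j), max(i, j))]

def D_of(q):
    """Determinant in (2.9), expanded along the first row."""
    a, b, c = r(q, 0, 1) + r(q, 1, 2) + r(q, 1, 3), -r(q, 1, 2), -r(q, 1, 3)
    e, g = r(q, 0, 2) + r(q, 1, 2) + r(q, 2, 3), -r(q, 2, 3)
    h = r(q, 0, 3) + r(q, 1, 3) + r(q, 2, 3)
    return a * (e * h - g * g) - b * (b * h - g * c) + c * (b * g - e * c)

def star_sum(q, weighted):
    """Sum over v of rho_va rho_vb rho_vc, optionally times the opposite triangle sum."""
    out = 0
    for v in range(4):
        a, b, c = (u for u in range(4) if u != v)
        tri = r(q, a, b) + r(q, a, c) + r(q, b, c)
        out += r(q, v, a) * r(q, v, b) * r(q, v, c) * (tri if weighted else 1)
    return out

def K_of(q):
    return star_sum(q, True)

def partial(F, q, e):
    """Exact partial derivative of a function of degree one in each coefficient."""
    bumped = dict(q)
    bumped[e] += 1
    return F(bumped) - F(q)

def sigma(q, i):
    j, k, l = (u for u in range(4) if u != i)
    return r(q, j, k) * r(q, j, l) + r(q, j, k) * r(q, k, l) + r(q, j, l) * r(q, k, l)

rho = dict(zip(PAIRS, (1, 2, 3, 5, 7, 11)))       # arbitrary sample coefficients
lam1, lam2, lam3 = rho[(0, 1)] * rho[(2, 3)], rho[(0, 2)] * rho[(1, 3)], rho[(0, 3)] * rho[(1, 2)]
total = sum(rho.values())
paths = (lam1 * (total - rho[(0, 1)] - rho[(2, 3)]) + lam2 * (total - rho[(0, 2)] - rho[(1, 3)])
         + lam3 * (total - rho[(0, 3)] - rho[(1, 2)]))

def combo(F):
    return (partial(F, rho, (0, 1)) + partial(F, rho, (2, 3))
            - partial(F, rho, (0, 2)) - partial(F, rho, (1, 3)))

checks = {
    "2.9": D_of(rho) == star_sum(rho, False) + paths,
    "2.11": partial(D_of, rho, (0, 1)) == sigma(rho, 0) + sigma(rho, 1) + lam2 + lam3,
    "2.13": combo(D_of) == 2 * (lam2 - lam1),
    "2.16": combo(K_of) == 2 * (lam2 - lam1) * (rho[(0, 3)] + rho[(1, 2)])
            - 2 * lam3 * (rho[(0, 1)] - rho[(0, 2)] + rho[(2, 3)] - rho[(1, 3)]),
}
print(checks)
```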


3. Proof of Theorem 1. We take $f$ in the form (2.1), and suppose that $f$ is extreme. We prove successively: (i) the two greater of $\lambda_1, \lambda_2, \lambda_3$ must be equal; (ii) $\lambda_1 = \lambda_2 = \lambda_3$; (iii) all $\rho_{ij}$ are equal. In each case the proof proceeds by exhibiting a variation of the coefficients $\rho_{ij}$ which, if the stated conditions are not satisfied, contradicts our supposition that $f$ is extreme. It will always suffice to work to the first order of small quantities; we denote generally by $\delta R$ the first order variation in a function $R$ of the $\rho_{ij}$ resulting from small variations $\delta\rho_{ij}$.

In order to apply the analysis of §2 to both $f$ and the neighbouring form $f' = \sum (\rho_{ij} + \delta\rho_{ij})(x_i - x_j)^2$, we must of course ensure that $\rho_{ij} + \delta\rho_{ij} \geq 0$ for all $i, j$. If all $\rho_{ij} > 0$, this will obviously hold for sufficiently small $\delta\rho_{ij}$ of either sign. If our hypotheses do not allow us to infer that $\rho_{ij} \neq 0$ for some $i, j$ we shall always choose the corresponding $\delta\rho_{ij} \geq 0$.


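LEMMA 3.1. If $f$ is extreme, it is impossible that

$$\lambda_1 > \lambda_2 \geq \lambda_3. \tag{3.1}$$

Proof. If (3.1) holds, we have $m(f) = f_1$, by (2.7), (2.8); and we shall have $m(f') = f_1'$ for any sufficiently near form $f'$.

We choose

$$\delta\rho_{01} = \delta\rho_{23} = -\epsilon, \quad \delta\rho_{02} = \delta\rho_{13} = \epsilon \quad (\epsilon > 0),$$

noting that (3.1) implies that $\rho_{01} > 0$, $\rho_{23} > 0$. Then, by (2.13),

$$\delta D = -\epsilon\left(\frac{\partial D}{\partial \rho_{01}} + \frac{\partial D}{\partial \rho_{23}} - \frac{\partial D}{\partial \rho_{02}} - \frac{\partial D}{\partial \rho_{13}}\right) = 2\epsilon(\lambda_1 - \lambda_2) > 0. \tag{3.2}$$

We set

$$L = D(\rho_{03} + \rho_{12}) - K - 4\lambda_2\lambda_3, \tag{3.3}$$

so that, by (2.7),

$$4f_1 = \rho_{01} + \rho_{02} + \rho_{13} + \rho_{23} + L/D. \tag{3.4}$$

Using (3.2) and (2.16) we find easily that

$$\delta L = (\rho_{03} + \rho_{12})\,\delta D - \delta K - 4\,\delta(\lambda_2\lambda_3) = -2\epsilon\lambda_3(\rho_{01} + \rho_{02} + \rho_{23} + \rho_{13}),$$

whence

$$\delta L \leq 0. \tag{3.5}$$

We have also

$$L \geq 0. \tag{3.6}$$

This may be verified by direct computation, using (3.3) and (3.1). We may argue more simply as follows:

Since $f(x) \geq f(1, 1, 0) = \rho_{01} + \rho_{02} + \rho_{13} + \rho_{23}$ for $(x_1, x_2, x_3) \equiv (1, 1, 0) \pmod 2$, we have $f(x) \geq \tfrac{1}{4}(\rho_{01} + \rho_{02} + \rho_{13} + \rho_{23})$ for $(x_1, x_2, x_3) \equiv (\tfrac{1}{2}, \tfrac{1}{2}, 0) \pmod 1$; hence $m(f) \geq \tfrac{1}{4}(\rho_{01} + \rho_{02} + \rho_{13} + \rho_{23})$. Since $f_1 = m(f)$, (3.6) follows at once from (3.4).

We have thus shown that

$$\delta D > 0, \quad \delta f_1 \leq 0,$$

whence, for all sufficiently small $\epsilon > 0$,

$$m(f')\,D'^{-1/3} = f_1'\,D'^{-1/3} < f_1\,D^{-1/3} = m(f)\,D^{-1/3}.$$

This contradicts our assumption that $f$ is extreme.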
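A small exact computation illustrates the variation used in Lemma 3.1. The sketch below evaluates $D$ from the expansion (2.9) and $m(f)$ from (2.7), (2.8) for one sample form with $\lambda_1 > \lambda_2 \geq \lambda_3$, applies the variation with $\epsilon = 1/100$, and compares $m(f)^3/D$ (the cube of the ratio $m(f)/D^{1/3}$) before and after. The sample coefficients, the value of $\epsilon$ and the function names are assumptions of the sketch.

```python
from fractions import Fraction as Fr

def invariants(p01, p02, p03, p12, p13, p23):
    """Return (D, m(f)) from the expansion (2.9) and the formulas (2.7), (2.8)."""
    total = p01 + p02 + p03 + p12 + p13 + p23
    l1, l2, l3 = p01 * p23, p02 * p13, p03 * p12
    D = (p01 * p02 * p03 + p01 * p12 * p13 + p02 * p12 * p23 + p03 * p13 * p23
         + l1 * (total - p01 - p23) + l2 * (total - p02 - p13) + l3 * (total - p03 - p12))
    K = (p01 * p02 * p03 * (p12 + p13 + p23) + p01 * p12 * p13 * (p02 + p03 + p23)
         + p02 * p12 * p23 * (p01 + p03 + p13) + p03 * p13 * p23 * (p01 + p02 + p12))
    fs = [(D * total - K - 4 * a * b) / (4 * D) for a, b in ((l2, l3), (l3, l1), (l1, l2))]
    return D, max(fs)

def ratio_cubed(coeffs):
    D, m = invariants(*coeffs)
    return m ** 3 / D

eps = Fr(1, 100)
base = (Fr(2), Fr(1), Fr(1), Fr(1), Fr(1), Fr(2))            # lambda = (4, 1, 1)
varied = (base[0] - eps, base[1] + eps, base[2], base[3], base[4] + eps, base[5] - eps)

D_base, m_base = invariants(*base)
D_var, m_var = invariants(*varied)
print(D_base, m_base, float(ratio_cubed(base)), float(ratio_cubed(varied)))
```

For this sample the determinant increases and the ratio decreases, as the Lemma predicts for any form satisfying (3.1).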


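LEMMA 3.2. If $f$ is extreme, it is impossible that

$$\lambda_1 = \lambda_2 > \lambda_3. \tag{3.7}$$

Proof. If (3.7) holds, we have $m(f) = f_1 = f_2 > f_3$; and, for any sufficiently near form $f'$, $m(f') = \max(f_1', f_2')$. We choose

$$\delta\rho_{01} = \delta\rho_{23} = -\epsilon_1, \quad \delta\rho_{02} = \delta\rho_{13} = -\epsilon_2, \quad \delta\rho_{03} = \delta\rho_{12} = \epsilon_1 + \epsilon_2,$$

where $\epsilon_1 > 0$, $\epsilon_2 > 0$ (noting that (3.7) implies that $\rho_{01}, \rho_{23}, \rho_{02}, \rho_{13}$ are all positive). By restricting $\epsilon_1, \epsilon_2$ to satisfy

$$\epsilon_1(\rho_{01} + \rho_{23}) = \epsilon_2(\rho_{02} + \rho_{13}),$$

we ensure that

$$\delta(\lambda_1 - \lambda_2) = 0.$$

By (2.13) and (3.7), and writing for convenience

$$\lambda = \lambda_1 = \lambda_2,$$

we have

$$\begin{aligned}
\delta D &= \epsilon_1\left(\frac{\partial D}{\partial \rho_{03}} + \frac{\partial D}{\partial \rho_{12}} - \frac{\partial D}{\partial \rho_{01}} - \frac{\partial D}{\partial \rho_{23}}\right) + \epsilon_2\left(\frac{\partial D}{\partial \rho_{03}} + \frac{\partial D}{\partial \rho_{12}} - \frac{\partial D}{\partial \rho_{02}} - \frac{\partial D}{\partial \rho_{13}}\right)\\
&= 2\epsilon_1(\lambda_1 - \lambda_3) + 2\epsilon_2(\lambda_2 - \lambda_3)\\
&= 2(\epsilon_1 + \epsilon_2)(\lambda - \lambda_3) > 0.
\end{aligned}$$

We set

$$M = (\rho_{12} + \rho_{13} + \rho_{23})D - K - 4\lambda_2\lambda_3,$$

so that

$$4f_1 = (\rho_{01} + \rho_{02} + \rho_{03}) + M/D.$$

Arguing as in Lemma 3.1, we have

$$f_1 = m(f) \geq f(\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2}) = \tfrac{1}{4}(\rho_{01} + \rho_{02} + \rho_{03}),$$

whence

$$M \geq 0.$$

Also, using (2.16) (with suitable permutations of the suffixes), we obtain

$$\begin{aligned}
\delta M &= (\rho_{12} + \rho_{13} + \rho_{23})\,\delta D - \delta K - 4\,\delta(\lambda_2\lambda_3)\\
&= -2(\epsilon_1 + \epsilon_2)\left[(\lambda - \lambda_3)(\rho_{01} + \rho_{02}) + \lambda\rho_{03} + \lambda_3\rho_{12}\right]\\
&< 0,
\end{aligned}$$

since $\lambda > \lambda_3$, $\rho_{01} + \rho_{02} > 0$.

Since $\delta D > 0$, $\delta(\rho_{01} + \rho_{02} + \rho_{03}) = 0$ and $\delta M < 0$, we see that $\delta f_1 < 0$; and by symmetry $\delta f_2 < 0$. Hence for all sufficiently small $\epsilon_1, \epsilon_2$ we have

$$D' > D, \quad m(f') < m(f),$$

contradicting our assumption that $f$ is extreme.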


LEMMA 3.3. If $f$ is extreme, then

$$\lambda_1 = \lambda_2 = \lambda_3. \tag{3.8}$$

Proof. By a suitable permutation of suffixes we can ensure that $\lambda_1 \geq \lambda_2 \geq \lambda_3$; the result now follows from Lemmas 3.1, 3.2.


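LEMMA 3.4. If $f$ is extreme, it is impossible that

$$\rho_{01} > \rho_{13}, \quad \rho_{02} > \rho_{23}. \tag{3.9}$$

Proof. Suppose that (3.9) holds. By Lemma 3.3, (3.8) holds and

$$m(f) = f_1 = f_2 = f_3.$$

We make the variation

$$\begin{aligned}
-\delta\rho_{01} = \delta\rho_{13} &= \epsilon_1 = \epsilon(\rho_{01} + \rho_{13}),\\
-\delta\rho_{02} = \delta\rho_{23} &= \epsilon_2 = \epsilon(\rho_{02} + \rho_{23}),\\
\delta\rho_{03} &= -\epsilon_3 = -\epsilon(\rho_{01} + \rho_{02} - \rho_{13} - \rho_{23}),\\
\delta\rho_{12} &= 0,
\end{aligned}$$

where $\epsilon > 0$. To justify this, we have to show that $\rho_{01} > 0$, $\rho_{02} > 0$, $\rho_{03} > 0$. Clearly $\rho_{01} > 0$, $\rho_{02} > 0$, by (3.9). If now $\rho_{03} = 0$, then $\lambda_3 = 0$, whence $\lambda_1 = \lambda_2 = 0$ by (3.8); this gives $\rho_{23} = 0$, $\rho_{13} = 0$, since $\rho_{01} \neq 0$, $\rho_{02} \neq 0$. But now $f = \rho_{01}x_1^2 + \rho_{02}x_2^2 + \rho_{12}(x_1 - x_2)^2$ and is clearly not positive definite.

It is easy to see that, for all sufficiently small $\epsilon$, we have $\lambda_1' = \lambda_2' > \lambda_3'$, so that the neighbouring form $f'$ has

$$m(f') = f_1' \ (= f_2').$$

For

$$\begin{aligned}
\lambda_1' - \lambda_2' &= (\rho_{01} - \epsilon_1)(\rho_{23} + \epsilon_2) - (\rho_{02} - \epsilon_2)(\rho_{13} + \epsilon_1)\\
&= \lambda_1 - \lambda_2 - \epsilon_1(\rho_{02} + \rho_{23}) + \epsilon_2(\rho_{01} + \rho_{13}) = 0;
\end{aligned}$$

and

$$\delta\lambda_1 = \rho_{01}\epsilon_2 - \rho_{23}\epsilon_1 = \epsilon(\rho_{01}\rho_{02} - \rho_{13}\rho_{23}) > 0, \quad \delta\lambda_3 = -\epsilon_3\rho_{12} \leq 0,$$

so that $\lambda_1' > \lambda_3'$.

We now obtain a contradiction to the fact that $f$ is extreme by showing that

$$\delta D = 0, \quad \delta f_1 < 0. \tag{3.10}$$

By (2.12) and (2.11) we have

$$\frac{\partial D}{\partial \rho_{01}} - \frac{\partial D}{\partial \rho_{13}} = \sigma_0 - \sigma_3 - \lambda_1 + \lambda_2 = (\rho_{02} + \rho_{12} + \rho_{23})(\rho_{13} + \rho_{23} - \rho_{01} - \rho_{02}) + \rho_{02}^2 - \rho_{23}^2,$$

and, by symmetry,

$$\frac{\partial D}{\partial \rho_{02}} - \frac{\partial D}{\partial \rho_{23}} = (\rho_{01} + \rho_{12} + \rho_{13})(\rho_{13} + \rho_{23} - \rho_{01} - \rho_{02}) + \rho_{01}^2 - \rho_{13}^2.$$

Hence

$$\begin{aligned}
-\epsilon_1&\left(\frac{\partial D}{\partial \rho_{01}} - \frac{\partial D}{\partial \rho_{13}}\right) - \epsilon_2\left(\frac{\partial D}{\partial \rho_{02}} - \frac{\partial D}{\partial \rho_{23}}\right)\\
&= \epsilon(\rho_{01} + \rho_{02} - \rho_{13} - \rho_{23})\left[(\rho_{01} + \rho_{13})(\rho_{02} + \rho_{12} + \rho_{23}) + (\rho_{02} + \rho_{23})(\rho_{01} + \rho_{12} + \rho_{13})\right]\\
&\qquad - \epsilon(\rho_{01} + \rho_{13})(\rho_{02}^2 - \rho_{23}^2) - \epsilon(\rho_{02} + \rho_{23})(\rho_{01}^2 - \rho_{13}^2)\\
&= \epsilon_3\left[(\rho_{01} + \rho_{13})(\rho_{02} + \rho_{12} + \rho_{23}) + (\rho_{02} + \rho_{23})(\rho_{01} + \rho_{12} + \rho_{13}) - (\rho_{01} + \rho_{13})(\rho_{02} + \rho_{23})\right]\\
&= \epsilon_3(\sigma_0 + \sigma_3 + \lambda_1 + \lambda_2)\\
&= \epsilon_3\,\frac{\partial D}{\partial \rho_{03}},
\end{aligned}$$

from which it follows immediately that $\delta D = 0$.

Writing, as in Lemma 3.1,

$$L = (\rho_{03} + \rho_{12})D - K - 4\lambda_2\lambda_3,$$

we have, using $\delta D = 0$,

$$\delta L = D\,\delta\rho_{03} - \delta K - 4\,\delta(\lambda_2\lambda_3);$$

and a calculation similar to the above, using (2.14), (2.15) and (3.8), gives

$$\delta L = -2\epsilon_3\rho_{03}\left[(\rho_{01} + \rho_{12} + \rho_{13})(\rho_{02} + \rho_{12} + \rho_{23}) - \rho_{12}^2\right] < 0,$$

since $\epsilon_3 > 0$, $\rho_{03} > 0$. As in Lemma 3.1 we deduce that $\delta f_1 < 0$.

This establishes (3.10), and the Lemma is proved.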
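The two first-order claims in this proof, $\delta D = 0$ and the closed form for $\delta L$, can be spot-checked exactly on one sample form satisfying (3.8) and (3.9). As before, the sketch uses the fact that $D$, $K$ and $\lambda_2\lambda_3$ are of degree one in each coefficient, so unit differences give exact partial derivatives; the sample coefficients and helper names are assumptions, and the common factor $\epsilon$ is set to 1.

```python
PAIRS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
MATCHINGS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]

def r(q, i, j):
    return q[(min(i, j), max(i, j))]

def star_sum(q, weighted):
    out = 0
    for v in range(4):
        a, b, c = (u for u in range(4) if u != v)
        tri = r(q, a, b) + r(q, a, c) + r(q, b, c)
        out += r(q, v, a) * r(q, v, b) * r(q, v, c) * (tri if weighted else 1)
    return out

def D_of(q):                      # expansion (2.9)
    total = sum(q.values())
    return star_sum(q, False) + sum(q[e] * q[g] * (total - q[e] - q[g]) for e, g in MATCHINGS)

def K_of(q):
    return star_sum(q, True)

def lam23(q):                     # lambda_2 * lambda_3
    return q[(0, 2)] * q[(1, 3)] * q[(0, 3)] * q[(1, 2)]

def first_order(F, q, d):
    """First-order variation of F along d (F has degree one in each coefficient)."""
    out = 0
    for e in PAIRS:
        bumped = dict(q)
        bumped[e] += 1
        out += d[e] * (F(bumped) - F(q))
    return out

rho = dict(zip(PAIRS, (3, 6, 2, 3, 1, 2)))   # lambda_1 = lambda_2 = lambda_3 = 6
e1 = rho[(0, 1)] + rho[(1, 3)]
e2 = rho[(0, 2)] + rho[(2, 3)]
e3 = rho[(0, 1)] + rho[(0, 2)] - rho[(1, 3)] - rho[(2, 3)]
d = {(0, 1): -e1, (1, 3): e1, (0, 2): -e2, (2, 3): e2, (0, 3): -e3, (1, 2): 0}

dD = first_order(D_of, rho, d)
dL = (d[(0, 3)] * D_of(rho) + (rho[(0, 3)] + rho[(1, 2)]) * dD
      - first_order(K_of, rho, d) - 4 * first_order(lam23, rho, d))
expected = -2 * e3 * rho[(0, 3)] * (
    (rho[(0, 1)] + rho[(1, 2)] + rho[(1, 3)]) * (rho[(0, 2)] + rho[(1, 2)] + rho[(2, 3)])
    - rho[(1, 2)] ** 2)
print(dD, dL, expected)
```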
LEMMA 3.5. If $f$ is extreme, then

$$\lambda_1 = \lambda_2 = \lambda_3 > 0. \tag{3.11}$$

Proof. By Lemma 3.3, it suffices to prove the impossibility of

$$\lambda_1 = \lambda_2 = \lambda_3 = 0. \tag{3.12}$$

Now if (3.12) holds, at least three $\rho_{ij}$ are zero. Since in any three $\rho_{ij}$ some suffix occurs at least twice, we may assume by symmetry that

$$\rho_{13} = \rho_{23} = 0. \tag{3.13}$$

Since $\lambda_3 = \rho_{03}\rho_{12} = 0$ and $\rho_{03} \neq 0$ (else $f$ does not involve $x_3$ and so is not definite) we have $\rho_{12} = 0$. Thus

$$f = \rho_{01}x_1^2 + \rho_{02}x_2^2 + \rho_{03}x_3^2,$$

and, since $f$ is definite, we have

$$\rho_{01} > 0, \quad \rho_{02} > 0. \tag{3.14}$$

Now (3.13) and (3.14) contradict Lemma 3.4.


LEMMA 3.6. If $f$ is extreme, then

$$\rho_{01} = \rho_{02} = \rho_{03} = \rho_{12} = \rho_{13} = \rho_{23}. \tag{3.15}$$

Proof. We first show that $\rho_{01} = \rho_{13}$.

If $\rho_{01} \neq \rho_{13}$, then, after interchanging 0 and 3 if necessary, we have

$$\rho_{01} > \rho_{13}.$$

Since by Lemma 3.5

$$\lambda_1 = \rho_{01}\rho_{23} = \rho_{02}\rho_{13} = \lambda_2 > 0,$$

we have also

$$\rho_{02} > \rho_{23}.$$

By Lemma 3.4, these inequalities cannot hold.

Thus $\rho_{01} = \rho_{13}$. By symmetry we have

$$\rho_{ij} = \rho_{jk}$$

for any distinct suffixes $i, j, k$; from this (3.15) follows immediately.

Lemma 3.6 shows that the only possible class of extreme forms is that represented by

$$f_0(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2 + (x_1 - x_2)^2 + (x_1 - x_3)^2 + (x_2 - x_3)^2;$$

and (1.4) of Theorem 1 is simply verified for $f = f_0$ by substituting $\rho_{ij} = 1$ in the formulae of §2.

Hence to complete the proof of Theorem 1 we have only to show that $f_0$ is in fact extreme. A direct proof of this is not difficult, but is rather tedious. It is simpler to appeal to a general theorem of Hlawka (2) which asserts the existence of a most economical lattice-covering of space, and hence the existence of a class of absolutely extreme forms (which can only be the class of $f_0$).


4. Remarks on the method. Voronoi (3; 4; 5) has given two distinct methods of reduction of positive quadratic forms. The first is based on the concept of perfect forms, and leads to a finite number of regions $R_0, R_1, \ldots, R_\tau$ in the $\tfrac{1}{2}n(n + 1)$-dimensional coefficient space, with the properties: (i) any form is equivalent to a form lying in one of the regions $R$; (ii) no two forms lying in the interior of different regions are equivalent.

The second is based on the consideration of types of space-filling polytopes (which may be derived from positive forms, as we derived $\Pi$ from $f$ in §2), and leads to regions $R'_0, R'_1, \ldots, R'_\sigma$ having the same two properties.

The "principal regions" $R_0$, $R'_0$ are derived respectively from the perfect form

$$\phi_0 = \sum_{i=1}^{n} x_i^2 + \sum_{i<j} x_i x_j$$

and its adjoint, a multiple of

$$f_0 = n\sum_{i=1}^{n} x_i^2 - 2\sum_{i<j} x_i x_j;$$

and in fact $R_0 = R'_0$.

For $n = 2$ and $n = 3$, $R_0 = R'_0$ is the only region, and we obtain for $n = 3$ the definition of reduction used in §2. For general $n \geq 2$, $R_0$ is the set of forms expressible as

$$f(x) = \sum_{i<j} \rho_{ij}(x_i - x_j)^2, \quad \rho_{ij} \geq 0 \ (i, j = 0, 1, \ldots, n), \quad x_0 = 0. \tag{4.1}$$

It is to be noted that the regions $R$ or $R'$ do not possess the property that no two forms interior to the same region are equivalent; for example, the result of Lemma 2.1 generalizes in the obvious way for the form (4.1). This fact, which (as Voronoi remarks) is normally a disadvantage in a method of reduction, is clearly seen from the analysis of §§2 and 3 to be of considerable advantage in the problem we have been investigating. What Voronoi's second method of reduction achieves is the specification of the broadest type of forms whose polytopes $\Pi$ (when not degenerate) are defined by the same set of integral points $l$; there is therefore little doubt that this method of reduction is best suited to the covering problem for each $n \geq 2$.

In conclusion, it is perhaps worth noting that the case $n = 2$ (for which there is just one region $R_0 = R'_0$) is very simply settled by these methods, and leads to


THEOREM 2. If $n = 2$, there is just one class of extreme forms, represented by

$$f_0(x_1, x_2) = x_1^2 + x_2^2 - x_1x_2,$$

and for this class

$$m(f) = \left(\frac{4}{27}\, D\right)^{1/2}.$$

We take $f$ in $R_0$, i.e.

$$f(x_1, x_2) = \rho_{01}x_1^2 + \rho_{02}x_2^2 + \rho_{12}(x_1 - x_2)^2, \quad \rho_{ij} \geq 0,$$

for which

$$D = \rho_{01}\rho_{02} + \rho_{01}\rho_{12} + \rho_{02}\rho_{12},$$

$$4Dm(f) = 4Df(v) = D(\rho_{01} + \rho_{02} + \rho_{12}) - \rho_{01}\rho_{02}\rho_{12},$$

(the value of $f(v)$ being the same for all vertices $v$ of $\Pi$).

If $\rho_{01} > \rho_{02}$, we take $\delta\rho_{01} = -\epsilon$, $\delta\rho_{02} = \epsilon$, $\epsilon > 0$, whence trivially $D$ is increased and

$$4f(v) = (\rho_{01} + \rho_{02}) + \rho_{12}^2(\rho_{01} + \rho_{02})/D$$

is not increased; thus $f$ cannot be extreme.

By symmetry it now follows that, for extreme $f$, we require $\rho_{01} = \rho_{02}$, and so $\rho_{01} = \rho_{02} = \rho_{12}$. Theorem 2 follows at once.
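The two-dimensional formulas are simple enough to evaluate directly. The sketch below checks the value of $m(f_0)$ (writing $f_0$ with $\rho_{01} = \rho_{02} = \rho_{12} = \tfrac{1}{2}$) and illustrates the variation used in the proof on one sample form with $\rho_{01} > \rho_{02}$; the function names, the sample form and the step $\epsilon = 1/10$ are assumptions of the sketch.

```python
from fractions import Fraction as Fr

def D2(a, b, c):
    """Determinant of rho01*x1^2 + rho02*x2^2 + rho12*(x1 - x2)^2, with (a, b, c) = (rho01, rho02, rho12)."""
    return a * b + a * c + b * c

def m2(a, b, c):
    """Inhomogeneous minimum from 4*D*m(f) = D*(a + b + c) - a*b*c."""
    D = D2(a, b, c)
    return (D * (a + b + c) - a * b * c) / (4 * D)

def ratio_squared(a, b, c):
    """Square of m(f) / D^(1/2)."""
    return m2(a, b, c) ** 2 / D2(a, b, c)

half = Fr(1, 2)
D0, m0 = D2(half, half, half), m2(half, half, half)   # f_0 = x1^2 + x2^2 - x1*x2

eps = Fr(1, 10)
before = ratio_squared(Fr(2), Fr(1), Fr(1))
after = ratio_squared(Fr(2) - eps, Fr(1) + eps, Fr(1))
print(D0, m0, float(before), float(after))
```

The ratio for the varied form is smaller, in line with the argument that a form with $\rho_{01} > \rho_{02}$ cannot be extreme.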


REFERENCES 


1. R. P. Bambah, On lattice coverings by spheres, Proc. Nat. Inst. Sci. India, 20 (1954), 25-52.

2. E. Hlawka, Ausfüllung und Überdeckung konvexer Körper durch konvexe Körper, Monatsh. Math. Phys., 53 (1949), 81-131.

3. G. Voronoi, Sur quelques propriétés des formes quadratiques positives parfaites, J. reine angew. Math., 133 (1907), 97-178.

4. G. Voronoi, Recherches sur les paralléloèdres primitifs (Part 1), ibid., 134 (1908), 198-287.

5. G. Voronoi, Recherches sur les paralléloèdres primitifs (Part 2), ibid., 136 (1909), 67-181.


University of Sydney 



















